conaticus/FileExplorer

Dev build crashes on macOS

FormalSnake opened this issue · 49 comments

cargo tauri dev

Running BeforeDevCommand (yarn dev)
yarn run v1.22.19
warning ../../package.json: No license field
$ vite

VITE v4.3.9 ready in 408 ms

➜ Local: http://127.0.0.1:1420/
➜ Network: use --host to expose
Info Watching /Users/kyandesutter/development/FileExplorer/src-tauri for changes...
Compiling file-explorer v0.0.0 (/Users/kyandesutter/development/FileExplorer/src-tauri)
Finished dev [unoptimized + debuginfo] target(s) in 4.08s
thread 'main' panicked at 'called Result::unwrap() on an Err value: Error("EOF while parsing a value", line: 1, column: 0)', src/filesystem.rs:176:63
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 5

Note from @conaticus for developers:
I have implemented a mechanism to stop the program from crashing if it is unable to deserialize the cache, but it leads to the program recaching every time it is re-run. Look out for a log message to see if the deserialization failed. There is also a panic on this line in search.rs when searching on macOS:

[image: the panicking line in search.rs]

This is likely a string mismatch causing the program to fail when retrieving the volume from the cache.
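
To make the suspected string mismatch concrete, here is a minimal sketch of a lookup that reports a missing mount point instead of panicking. The `VolumeCache` type and the `volume_for` function are hypothetical stand-ins and are not taken from the project's search.rs; the real cache layout may differ.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one volume's index: file name -> full paths.
type VolumeCache = HashMap<String, Vec<String>>;

/// Looks up the cached volume for `mount_point`.
/// Returns an error that lists the known keys instead of panicking on a miss.
fn volume_for<'a>(
    cache: &'a HashMap<String, VolumeCache>,
    mount_point: &str,
) -> Result<&'a VolumeCache, String> {
    cache.get(mount_point).ok_or_else(|| {
        let mut known: Vec<&str> = cache.keys().map(String::as_str).collect();
        known.sort();
        format!("no cached volume for mount point {mount_point:?}; known: {known:?}")
    })
}
```

Printing the requested key next to the known keys would show directly whether the mount point reported at search time matches what was stored when the cache was built.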

Hey, can you try deleting your cache and running it again, then tell me whether it worked for you? There have been some changes to the cache format that would render your cache invalid. Your cache is stored in src-tauri/system_cache.json.

The cache issue should come up again as we're working on changing to a binary format.

The problem with that is the window becomes blank and unresponsive.

@FormalSnake yes, that's by design. The window freezes for a minute or two while caching, because we haven't implemented the loading screen yet and the caching is blocking for now.
Wait a little bit and you should see the file explorer working. You can expect the long wait to be gone in the future.

I have let it run for an hour now and it still isn't done, and didn't write anything.

It should only take a couple of minutes at most. Something must be going wrong here. I'll see what we can do, but I don't have a Mac, so I'll have to rely on other people's help.

This is likely a string mismatch causing the program to fail when retrieving the volume from the cache.

mount_pnt is actually an empty string ("") on macOS; not sure why it would not be in the cache.
Probably something to do with Windows and Unix having different file systems.

I'm on Linux, so it's definitely not a Unix thing: it works for me, and someone else on Linux has tried it and it works for them as well. We're using a cross-platform crate to retrieve mount points, which are then stored in the cache. I'm not sure whether this is an error on our part or a bug in the crate itself.

It seems to me the current code is calling .unwrap() on a poisoned mutex when accessing state in the run_cache_interval loop. Changing the loop to this:

        loop {
            interval.tick().await;

            let mut guard = match state_clone.lock() {
                Ok(state) => state,
                Err(poison_error) => poison_error.into_inner(),
            };

            save_to_cache(&mut guard);
        }

did the trick (honestly feels dirty tho 👀 ).

EDIT: tested on MacOS M1 Ventura

If this works please make a pr, I am not familiar with rust so I do not know where to put this modification.

I'll look into the other issue you mentioned in the OP.

@HenryRoutson at first I experienced the same, but deleting system_cache.json resolved the issue.

From the docs:

[...] where a mutex is considered poisoned whenever a thread panics while holding the mutex. Once a mutex is poisoned, all other threads are unable to access the data by default as it is likely tainted (some invariant is not being upheld).
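
The behaviour described in the docs can be reproduced with the standard library alone. The sketch below uses made-up helper names (`poison`, `len_ignoring_poison`) and a plain `Vec<String>` in place of the application state; it only illustrates how a panic under the lock poisons the mutex and how `into_inner` recovers the guard anyway.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons `shared` by panicking in another thread while the lock is held.
fn poison(shared: &Arc<Mutex<Vec<String>>>) {
    let clone = Arc::clone(shared);
    let handle = thread::spawn(move || {
        let mut guard = clone.lock().unwrap();
        guard.push("half-written entry".to_string());
        panic!("simulated failure while holding the lock");
    });
    // join() returns Err because the thread panicked; that is expected here.
    let _ = handle.join();
}

/// Returns the number of entries, recovering the guard if the mutex is poisoned.
fn len_ignoring_poison(shared: &Mutex<Vec<String>>) -> usize {
    let guard = match shared.lock() {
        Ok(guard) => guard,
        // The data may be in an inconsistent state; we choose to read it anyway.
        Err(poisoned) => poisoned.into_inner(),
    };
    guard.len()
}
```

After `poison` runs, every plain `lock().unwrap()` on the same mutex panics, which matches the cascade of `PoisonError` panics that one failed thread can trigger. Recovering with `into_inner` keeps the program alive but does nothing about whatever caused the first panic.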

I've tested it, it works for me as well.
I think the issue is that the code keeps triggering the recaching without checking if the previous caching routine is done or not. If the caching takes more than 30 seconds it will spawn a new routine that will try to lock the mutex and panic causing the poisoning.

@conaticus to fix this we could use a channel to decide when to restart the caching if necessary.
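
One way the channel idea could look, as a rough sketch only: a single worker thread owns the caching job and a bounded channel carries "please recache" requests, so two runs can never overlap. The names `spawn_cache_worker` and `request_recache` are hypothetical, and this uses std threads rather than the project's async runtime.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};
use std::thread::{self, JoinHandle};

/// Spawns one worker that runs `job` once per request, strictly one at a time.
/// The worker exits when every sender is dropped and returns how often it ran.
fn spawn_cache_worker<F>(mut job: F) -> (SyncSender<()>, JoinHandle<usize>)
where
    F: FnMut() + Send + 'static,
{
    // Capacity 1: at most one request can wait while a job is running.
    let (tx, rx) = sync_channel::<()>(1);
    let handle = thread::spawn(move || {
        let mut runs = 0usize;
        while rx.recv().is_ok() {
            job();
            runs += 1;
        }
        runs
    });
    (tx, handle)
}

/// Asks for a recache. Returns false if a request is already queued
/// (or the worker is gone), in which case the tick is simply dropped.
fn request_recache(tx: &SyncSender<()>) -> bool {
    match tx.try_send(()) {
        Ok(()) => true,
        Err(TrySendError::Full(())) | Err(TrySendError::Disconnected(())) => false,
    }
}
```

With this shape, a timer can call `request_recache` every 30 seconds without caring how long the previous run took: a slow run delays the next one instead of racing it.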

Also @FormalSnake, double-check that you have run yarn and installed tauri-cli.
I just did cargo run out of intuition the first time and got a blank screen.

cargo install tauri-cli
yarn

Where should I put the provided code snippet?

paste this in your terminal

cargo install tauri-cli
yarn

and then run

cargo tauri dev

I know that, but I mean where do I put this:

        loop {
            interval.tick().await;

            let mut guard = match state_clone.lock() {
                Ok(state) => state,
                Err(poison_error) => poison_error.into_inner(),
            };

            save_to_cache(&mut guard);
        }

Search for interval.tick and you should find the loop in cache.rs that you want to replace.

I am not sure but i do not think it is working

I deleted the cache and i am running it now.

it's been running for 9 minutes and it didn't add anything to the system_cache, and it is still unresponsive

@FormalSnake how much disk space are you using? 👀

I have a 256 GB M1 MacBook Air and I have 15.5 GB left 💀

There's no system cache file anymore, at least not in the repo. The cache now goes to the appropriate location depending on the operating system. Check this to know where it goes on your system.

it did create a new json for me tho. Is there a new commit?

I've tested it, it works for me as well. I think the issue is that the code keeps triggering the recaching without checking if the previous caching routine is done or not. If the caching takes more than 30 seconds it will spawn a new routine that will try to lock the mutex and panic causing the poisoning.

@conaticus to fix this we could use a channel to decide when to restart the caching if necessary.

Only reading this now.

There's no need to re-cache if another spawn is already doing just that. Accessing the poisoned mutex would create issues in the long run I suspect.
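
A small sketch of "skip the recache if one is already running", assuming a shared flag is acceptable. `run_if_idle` and `BusyGuard` are illustrative names, not part of the project.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Clears the busy flag when dropped, even if the job panics.
struct BusyGuard<'a>(&'a AtomicBool);

impl Drop for BusyGuard<'_> {
    fn drop(&mut self) {
        self.0.store(false, Ordering::Release);
    }
}

/// Runs `job` unless another call is already running one.
/// Returns true if the job ran, false if it was skipped.
fn run_if_idle(busy: &AtomicBool, job: impl FnOnce()) -> bool {
    // Atomically flip false -> true; failure means a recache is in flight.
    if busy
        .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
        .is_err()
    {
        return false;
    }
    let _guard = BusyGuard(busy);
    job();
    true
}
```

Because the flag is reset in `Drop`, a panic inside the job does not leave the cache permanently marked as busy, and no mutex needs to be held for the duration of the run.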

As @FormalSnake's example shows, with a loaded disk (and a fast chipset! Imagine a movie buff with TBs of data and an Intel from 2012), I guess 30 sec isn't enough, and thus it kinda makes my PR obsolete 🥲

so is mine supposed to take a bit?

@FormalSnake try out

git pull origin dev

The og dev branch has kind of gotten obliterated. I think you're running a commit that is a few days old. Try running that command or just recloning the project (the lazy route).

If you're on the latest version it should cache to your system's cache directory. If it fails to read the cache file it will print a log message in the console.
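
The fallback described here might look roughly like the following. This is a sketch under assumptions: `Cache` is a placeholder type, and `decode` is a toy line-based decoder standing in for the project's real deserializer, which is not shown in this thread. Only the log message text is taken from the discussion.

```rust
use std::fs;
use std::path::Path;

/// Placeholder cache type for illustration: a flat list of indexed paths.
type Cache = Vec<String>;

/// Toy decoder standing in for the real one: one path per line.
/// Empty input is rejected, as an empty cache file would be.
fn decode(bytes: &[u8]) -> Result<Cache, String> {
    if bytes.is_empty() {
        return Err("unexpected end of input".to_string());
    }
    let text = std::str::from_utf8(bytes).map_err(|e| e.to_string())?;
    Ok(text.lines().map(str::to_string).collect())
}

/// Loads the cache from `path`; on any read or decode error, logs the
/// failure and calls `rebuild` instead of panicking.
fn load_or_rebuild(path: &Path, rebuild: impl FnOnce() -> Cache) -> Cache {
    let loaded = fs::read(path)
        .map_err(|e| e.to_string())
        .and_then(|bytes| decode(&bytes));
    match loaded {
        Ok(cache) => cache,
        Err(err) => {
            eprintln!("Failed to deserialize the cache from disk, recaching... ({err})");
            rebuild()
        }
    }
}
```

A missing file, an empty file, and a file in an outdated format all take the same rebuild path, so a stale cache no longer needs to be deleted by hand.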

As mentioned in the edit there are some known issues with macOS that need to be addressed. We are waiting on a Mac developer familiar with Rust to assist with this.

@RaphGL's idea is a valid problem that a new issue should be created for; however, I am not convinced this is necessarily the problem @FormalSnake is experiencing.

@FormalSnake please could you pull the latest version on the dev branch and let me know the following:

  • Any panics that occur
  • If there is a log message ever printed saying Failed to deserialize the cache from disk, recaching...
  • Whether there is a file named file-explorer.cache.bin in your /Users/{Your Name}/Library/Caches
  • The file size of the file-explorer.cache.bin if it exists

Thanks

@jokorone you still helped figure out what the issue was tho :)
You could work on synchronizing the thread before respawning if you feel like it and report your findings.

What exactly is the idea with updating the cache in an interval?

Seems to me this eats up some memory? Not sure if it's negligible.
A watcher would do the same thing, but only run when something actually changed.

It loaded really quickly now! It opened to a Loading... screen and it created a bin file in my cache folder.

The cache is stored on disk and this is a very big file that takes a few seconds to update. Filesystem changes actually occur in the background very frequently and this would invoke the program to write the file too often. It's not a perfect solution but a temporary one. I am about to open an issue about the 30s delay edge case.
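
The two positions can be combined: let the watcher only mark the in-memory cache as changed, and let the timer write to disk only when that mark is set. The sketch below shows that idea with a hypothetical `DirtyFlag` type; it is not how the project currently works.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Tracks whether the in-memory cache has changed since the last disk write.
struct DirtyFlag(AtomicBool);

impl DirtyFlag {
    fn new() -> Self {
        DirtyFlag(AtomicBool::new(false))
    }

    /// Cheap enough to call from a watcher callback on every filesystem event.
    fn mark(&self) {
        self.0.store(true, Ordering::Release);
    }

    /// Called on each timer tick: runs `write` only if something changed
    /// since the previous flush. Returns whether a write happened.
    fn flush_if_dirty(&self, write: impl FnOnce()) -> bool {
        if self.0.swap(false, Ordering::AcqRel) {
            write();
            true
        } else {
            false
        }
    }
}
```

Any number of filesystem events between two ticks collapses into a single write, and a quiet interval costs no disk I/O at all.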

Great! Any errors or crashes? Does the searching work okay?

It's still loading currently.

I am going away currently so I will say if it loaded or not.

If it has taken this long then @RaphGL is correct. Sounds like it's trying to recache before finishing the first cache resulting in an infinite loop.

It is still going

I do not have any errors

Even on slower systems this shouldn't take longer than 5 minutes. As mentioned it will likely be an infinite loop happening here. Thanks for your help and I apologise that this isn't working for you currently.

No need to apologize. I hope we figure it out

After a few hours it did manage to cache!

searching is bad tho:

thread 'tokio-runtime-worker' panicked at 'called Option::unwrap() on a None value', src/search.rs:85:59
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/filesystem/cache.rs:139:49
thread 'notify-rs fsevents loop' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/filesystem/cache.rs:48:48
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34
thread 'tokio-runtime-worker' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', src/search.rs:82:34

And restarting the app makes it go back to Loading...

@FormalSnake if you start searching before entering a disk/volume you'll get a panic, is that what u did?

It also doesn't work inside a volume.

We should probably wait for the upcoming work on the search stuff before checking that I think.
I think this issue can now be closed as you've already managed to build and run the program.
Please create an issue if you get this error after the search refactor/rewrite is merged.

No problem! Thanks for all the support!

This still needs addressing in future