PoC-Consortium/engraver

Engraver crashes after allocating plotfile on an NTFS partition (linux)

Closed this issue · 10 comments

Not exactly sure what to make of this, but here's the last few lines of output with backtrace.

Numeric ID:  5373760232911902050
Start Nonce: 0
Nonces:      18922368 (rounded to sector size for fast direct i/o)
Output File: /media/golyalpha/Plot1/5373760232911902050_0_18922368

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', libcore/result.rs:1009:5
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
   1: std::sys_common::backtrace::print
   2: std::panicking::default_hook::{{closure}}
   3: std::panicking::default_hook
   4: std::panicking::rust_panic_with_hook
   5: std::panicking::continue_panic_fmt
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::result::unwrap_failed
             at libcore/macros.rs:26
   9: <core::result::Result<T, E>>::unwrap
             at libcore/result.rs:808
  10: engraver::utils::preallocate
             at src/utils.rs:119
  11: engraver::plotter::Plotter::run
             at src/plotter.rs:234
  12: engraver::main
             at src/main.rs:246
  13: std::rt::lang_start::{{closure}}
             at libstd/rt.rs:74
  14: std::panicking::try::do_call
  15: __rust_maybe_catch_panic
  16: std::rt::lang_start_internal
  17: std::rt::lang_start
             at libstd/rt.rs:74
  18: main
  19: __libc_start_main
  20: _start
Fast file pre-allocation...%

And here's the file it created.
[Image: screenshot from 2018-12-28 23-38-16]
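The backtrace above ends in an `unwrap()` inside `engraver::utils::preallocate`, so the OS error surfaces as a panic rather than a readable message. The thread does not show what `preallocate` does internally, so the following is only a minimal sketch of the general idea of returning the error with context instead of unwrapping. The function name `preallocate_checked` is hypothetical, and `set_len` is a stand-in that usually produces a sparse file rather than reserving blocks.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a pre-allocation helper (not engraver's code).
/// Creates `path` if needed and sets its length, returning the OS error
/// with the path attached instead of panicking on failure.
fn preallocate_checked(path: &Path, size_in_bytes: u64) -> io::Result<()> {
    let file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(false)
        .open(path)
        .map_err(|e| {
            // Keep the original error kind so callers can still match on it.
            io::Error::new(e.kind(), format!("cannot open {}: {}", path.display(), e))
        })?;

    // Assumption: set_len is used here only for illustration. It typically
    // yields a sparse file and is not equivalent to a real block reservation.
    file.set_len(size_in_bytes).map_err(|e| {
        io::Error::new(e.kind(), format!("cannot resize {}: {}", path.display(), e))
    })
}
```

A caller could then print the returned error and exit cleanly, which would show the offending path next to "No such file or directory".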

Same happens to me with the latest pre-built binary for Linux.

Also, running cargo test passes.

Can you share the command you used to start engraver? What file system is on your drive? I'll have a look then.

The filesystem should have been either ext4 or NTFS, but ext4 more likely.
The command: ./engraver_cpu -n 18922400 -s 0 -i 5373760232911902050 -p /media/golyalpha/Plot1
It's the same for both the precompiled binary and the locally built one (except for the executable name).
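The command asks for 18922400 nonces while the log reports 18922368 "rounded to sector size". Those numbers are consistent with rounding down to a multiple of 64, which is what you get if one sector holds 4096 / 64 scoops. The sketch below reproduces that arithmetic under those assumptions. The 4096-byte sector size is assumed (the real value depends on the drive), and the function names are hypothetical, not taken from engraver.

```rust
/// Bytes per scoop in the PoC plot format.
const SCOOP_SIZE: u64 = 64;
/// Scoops per nonce, so one nonce occupies 64 * 4096 = 262144 bytes.
const SCOOPS_PER_NONCE: u64 = 4096;

/// Rounds a nonce count down so that each scoop's slice of the file
/// (nonces * 64 bytes) is a whole number of sectors.
/// `sector_size` is assumed to be a multiple of SCOOP_SIZE.
fn round_nonces_to_sector(nonces: u64, sector_size: u64) -> u64 {
    let nonces_per_sector = sector_size / SCOOP_SIZE;
    nonces / nonces_per_sector * nonces_per_sector
}

/// Size in bytes of a plot file holding `nonces` nonces.
fn plot_file_size(nonces: u64) -> u64 {
    nonces * SCOOP_SIZE * SCOOPS_PER_NONCE
}
```

With an assumed 4096-byte sector, `round_nonces_to_sector(18922400, 4096)` gives 18922368, matching the logged value.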

Well, so, now that we've managed to start hashing (on an ext4 partition, NTFS still crashes), this happens (I suspect it's at the start of a write, but I might be wrong; I'm not a Rust developer):

./engraver -n 20775476 -s 0 -i 5373760232911902050 -p /media/golyalpha/Plot
Engraver 2.2.0 - PoC2 Plotter

CPU: Intel(R) Core(TM) i3-3110M CPU @ 2.40GHz [using 4 of 4 cores + AVX]
RAM: Total=7.68 GiB, Free=2.15 GiB, Usage=2.09 GiB
Numeric ID:  5373760232911902050
Start Nonce: 0
Nonces:      20775424 (rounded to sector size for fast direct i/o)
Output File: /media/golyalpha/Plot/5373760232911902050_0_20775424

Fast file pre-allocation...OK
Starting plotting...

Hashing: │█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│ 0.02 % 18.29 MB/s 4732m 
Writing: │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│ 0.00 % 0 B/s  
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 22, kind: InvalidInput, message: "Invalid argument" }', libcore/result.rs:1009:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
   1: std::sys_common::backtrace::print
   2: std::panicking::default_hook::{{closure}}
   3: std::panicking::default_hook
   4: std::panicking::rust_panic_with_hook
   5: std::panicking::continue_panic_fmt
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::result::unwrap_failed

Precompiled binaries don't suffer from this.

Almost there! It's direct i/o again. Invalid argument = direct i/o fails.

Please try the -d switch to disable direct i/o.
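For background on why direct i/o can produce "Invalid argument": on Linux, reads and writes on a file opened for direct i/o generally need the buffer address, the transfer length, and the file offset to be aligned to the device's block size, and a misaligned request is a common source of this error. The thread does not say which of these, if any, was the actual bug, so this is only a sketch of the alignment check itself. The function names and the 4096-byte alignment in the usage note are assumptions.

```rust
/// Returns true when the buffer address, transfer length and file offset
/// are all multiples of `alignment` (hypothetical helper, not engraver's).
fn is_direct_io_aligned(buf_addr: usize, len: usize, offset: u64, alignment: usize) -> bool {
    alignment != 0
        && buf_addr % alignment == 0
        && len % alignment == 0
        && offset % alignment as u64 == 0
}

/// Returns a `len`-byte window inside `backing` whose start address is a
/// multiple of `alignment`, or None if `backing` is too small.
/// Over-allocating `backing` by `alignment` bytes guarantees a fit.
fn aligned_window(backing: &mut [u8], len: usize, alignment: usize) -> Option<&mut [u8]> {
    if alignment == 0 {
        return None;
    }
    let addr = backing.as_ptr() as usize;
    // Bytes to skip so that the window starts on an aligned address.
    let pad = (alignment - addr % alignment) % alignment;
    backing.get_mut(pad..pad + len)
}
```

For example, an 8192-byte write buffer could be carved out of a `vec![0u8; 8192 + 4096]` with `aligned_window(&mut backing, 8192, 4096)` and checked with `is_direct_io_aligned` before the write is issued.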

And one more thing: plotting to NTFS on a *nix system does not work well. NTFS has security features that prevent quick file allocation unless you have special rights, and these rights can only be obtained under Windows. As a result, NTFS file allocation on *nix takes ages (the file is created and zeros are written; 8 TB can take 12h+).
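The slow path described here, creating the file and writing zeros, can be sketched as follows. This is only an illustration of why it is slow: every byte of the file has to pass through the disk, so at the quoted 8 TB in 12 hours the drive is sustaining roughly 185 MB/s. The function name and chunk size are hypothetical, and this is not how engraver or the NTFS driver implement it.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Allocates a file the slow way: by writing `size_in_bytes` zeros in
/// chunks of `chunk_size` bytes (hypothetical helper for illustration).
fn allocate_by_zero_fill(path: &Path, size_in_bytes: u64, chunk_size: usize) -> io::Result<()> {
    if chunk_size == 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "chunk_size must be non-zero",
        ));
    }
    let mut file = File::create(path)?;
    let zeros = vec![0u8; chunk_size];
    let mut remaining = size_in_bytes;
    while remaining > 0 {
        // The last chunk may be shorter than chunk_size.
        let n = remaining.min(chunk_size as u64) as usize;
        file.write_all(&zeros[..n])?;
        remaining -= n as u64;
    }
    // Flush data and metadata to disk before reporting success.
    file.sync_all()
}
```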

Is it possible to skip fast file allocation if FS is NTFS and platform is *nix?
Also, that crash I showed you (my last comment) is happening on an ext4 partition.

Also, about fast file allocation on NTFS: the file in the screenshot in the OP took about the same time as a 5.4 TB file does on an ext4 partition.

Very difficult for NTFS under *nix. I think the driver will automatically use the slow allocation. The only way to make it fast is to pre-allocate the file on a Windows machine, stop the plotting, and resume on *nix.

Will double-check the creation on ext4.

Hi, update on this issue: in the current release, direct i/o is bugged on Linux only. I've created a fix in master and will soon compile a new release. The current Linux release only works with direct i/o disabled.

Fixed in 2.4.0.