Better support for multi-threading
bovee opened this issue · 2 comments
I tried writing the following documentation on how to pass records off to e.g. a thread poll, but it doesn't work.
//! If you want to pass records off to a thread pool:
//! ```
//! # #[cfg(feature = "std")] {
//! use std::fs::File;
//! use rayon;
//! use entab::parsers::extract_opt;
//! use entab::readers::init_state;
//! use entab::readers::fastq::FastqRecord;
//!
//! let f = File::open("./tests/data/test.fastq")?;
//! let (mut rb, mut state) = init_state(f, None)?;
//! while let Some(slice) = rb.refill()? {
//! let consumed = &mut 0;
//! let eof = rb.eof;
//! rayon::scope(|s| {
//! while let Some(FastqRecord { id, ..}) = extract_opt(slice, eof, consumed, &mut state).unwrap() {
//! s.spawn(move |_| {
//! println!("{}", id);
//! });
//! }
//! Ok::<(), &str>(())
//! });
//! rb.consumed += *consumed;
//! }
//! # }
//! # use entab::EtError;
//! # Ok::<(), EtError>(())
//! ```
It threw the following errors:
error[E0499]: cannot borrow `state` as mutable more than once at a time
--> src/lib.rs:51:83
|
16 | rayon::scope(|s| {
| - has type `&Scope<'1>`
17 | while let Some(FastqRecord { id, ..}) = extract_opt(slice, eof, consumed, &mut state).unwrap() {
| ^^^^^^^^^^ `state` was mutably borrowed here in the previous iteration of the loop
18 | / s.spawn(move |_| {
19 | | println!("{}", id);
20 | | });
| |______________- argument requires that `state` is borrowed for `'1`
error[E0503]: cannot use `rb.eof` because it was mutably borrowed
--> src/lib.rs:49:15
|
13 | while let Some(slice) = rb.refill()? {
| ----------- borrow of `rb` occurs here
14 | let consumed = &mut 0;
15 | let eof = rb.eof;
| ^^^^^^ use of borrowed `rb`
16 | rayon::scope(|s| {
17 | while let Some(FastqRecord { id, ..}) = extract_opt(slice, eof, consumed, &mut state).unwrap() {
| ----- borrow later captured here by closure
The ergonomics here are still pretty bad (e.g. it would be nice to write rb.next_no_refill
?) but that might be because I haven't written any crossbeam/rayon/etc code lately.
Redid this quite a bit in 793c726 to use a BufferChunk
object that gets passed in and out and I think it's a bit better. You still need to wrap state
in a Mutex
and return the BufferChunk
from the rayon
scope, but it seems like it works (although I've only tested it on toy examples where the multithreading overhead totally kills any perf gains).
As a note, I'm considering the current API as a beta and I marked all the documentation for it as hidden until it gets trialed a little more (preferably by someone with an actual use case!)