blekhmanlab/compendium

Add ability to restart a failed project with `matchids` option

In some cases, the forward and reverse read files don't match exactly, apparently because they were filtered separately. The DADA2 `matchIDs` option (an argument to `filterAndTrim`) attempts to pair them up by keeping only reads whose IDs appear in both files. It would be nice to have a way to restart a project with that option enabled when enough reverse reads remain for pairing to be practical and the result would be better than re-running as single-end.
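
To make the idea of pairing by ID concrete, here is a minimal Python sketch. It is a conceptual illustration only, not how DADA2 implements the option: the function name `pair_reads_by_id` and the toy records are made up for this example, and real FASTQ headers would need their ID field parsed out first.

```python
def pair_reads_by_id(forward, reverse):
    """Return (read_id, forward_seq, reverse_seq) tuples for IDs in both inputs.

    `forward` and `reverse` are lists of (read_id, sequence) tuples.
    Output follows the order of the forward reads; unmatched reads are dropped.
    """
    reverse_by_id = dict(reverse)
    return [
        (read_id, seq, reverse_by_id[read_id])
        for read_id, seq in forward
        if read_id in reverse_by_id
    ]


# Toy data (hypothetical): read "r2" survived filtering only in the forward file.
forward = [("r1", "ACGT"), ("r2", "GGCC"), ("r3", "TTAA")]
reverse = [("r1", "ACGA"), ("r3", "TTAG")]
paired = pair_reads_by_id(forward, reverse)
```

With these inputs, `paired` holds the `r1` and `r3` pairs, and the orphaned forward read `r2` is discarded rather than causing a count mismatch.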

My first guess is to add optional flags to the `again` command.
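
One possible shape for this, sketched with Python's standard `argparse`. Everything here is an assumption for discussion: the `--matchids` flag name, the `project` positional argument, and the parser layout are hypothetical and do not reflect how the existing `again` command is actually defined.

```python
import argparse


def build_parser():
    # Hypothetical parser layout; the real CLI may be organized differently.
    parser = argparse.ArgumentParser(prog="compendium-sketch")
    subcommands = parser.add_subparsers(dest="command", required=True)

    again = subcommands.add_parser("again", help="restart a failed project")
    again.add_argument("project", help="project accession, e.g. PRJNA639644")
    # Proposed optional flag: off by default so existing behavior is unchanged.
    again.add_argument(
        "--matchids",
        action="store_true",
        help="re-run paired-end filtering with ID matching enabled",
    )
    return parser


args = build_parser().parse_args(["again", "PRJNA639644", "--matchids"])
```

Using `store_true` keeps the flag opt-in, so a plain `again PRJNA639644` would behave as it does today.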

Example from project PRJNA639644:

Loading required package: Rcpp
Read 227 items
[1] "Thu Jan 12 21:11:08 2023 Paired-end data found!"
[1] "Thu Jan 12 21:11:08 2023 Filtering..."
Error in filterAndTrim(forward_reads, filtered_forward_reads, reverse_reads,  :
  These are the errors (up to 5) encountered in individual cores...
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  :
  Mismatched forward and reverse sequence files: 39797, 39795.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  :
  Mismatched forward and reverse sequence files: 47109, 47108.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  :
  Mismatched forward and reverse sequence files: 7700, 7696.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  :
  Mismatched forward and reverse sequence files: 39797, 39795.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  :
  Mismatched forward and reverse sequence files: 47109, 47108.
In addition: Warning message:
In mclapply(seq_len(n), do_one, mc.preschedule = mc.preschedule,  :
  scheduled cores 1, 5, 7 encountered errors in user code, all values of the jobs will be affected
Execution halted
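
To judge whether a failed project is a good candidate for a restart with ID matching, the mismatch sizes can be pulled out of a log like the one above. This is a rough sketch: the helper name `mismatch_counts` is hypothetical, and the regular expression only assumes the "Mismatched forward and reverse sequence files" message format shown in this log.

```python
import re

# Matches the DADA2 error line and captures the forward and reverse read counts.
MISMATCH = re.compile(
    r"Mismatched forward and reverse sequence files: (\d+), (\d+)\."
)


def mismatch_counts(log_text):
    """Return unique (forward, reverse) read-count pairs in order of appearance."""
    seen = []
    for fwd, rev in MISMATCH.findall(log_text):
        pair = (int(fwd), int(rev))
        if pair not in seen:  # the same sample can be reported more than once
            seen.append(pair)
    return seen


# Error lines copied from the PRJNA639644 log.
log = """\
  Mismatched forward and reverse sequence files: 39797, 39795.
  Mismatched forward and reverse sequence files: 47109, 47108.
  Mismatched forward and reverse sequence files: 7700, 7696.
  Mismatched forward and reverse sequence files: 39797, 39795.
  Mismatched forward and reverse sequence files: 47109, 47108.
"""

counts = mismatch_counts(log)
unpaired = [fwd - rev for fwd, rev in counts]
```

For this project the forward and reverse files differ by only 2, 1, and 4 reads, which is the kind of case where matching IDs seems preferable to falling back to single-end.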