rrwick/Deepbinner

Bug: v0.2.0 Deepbinner bin - "readclass not found"

Opened this issue · 3 comments

rekm commented

I am running deepbinner bin on already basecalled MinION data.
The tool throws KeyError("not found") whenever it encounters a read ID that has no classification.

Data:
Flow-Cells: FLO-MIN106
Kit: SQK-RBK004

Classification was generated per run with:

deepbinner classify --rapid p_root/fast5/pass/{R_num}  1> {output} 2> {log}

Binning with:

deepbinner bin --classes {input.classi} --reads {input.pass_fastq} --out_dir {outdir}

For some runs, the classification output omits a read, which sends bin.py into a section of code that appears to be previously untested or disabled.
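To illustrate the failure mode, here is a minimal sketch of binning with a fallback for reads that have no classification. It is not Deepbinner's actual bin.py; the function name bin_reads, the NOT_FOUND label and the sample read IDs are all hypothetical.

```python
from collections import defaultdict

# Hypothetical label for reads that are absent from the classification file.
NOT_FOUND = "not found"


def bin_reads(classifications, read_ids):
    """Group read IDs by barcode, sending unclassified reads to a fallback bin.

    classifications: dict mapping read ID -> barcode label (assumed layout).
    read_ids: iterable of read IDs taken from the FASTQ file.
    """
    bins = defaultdict(list)  # bins are created on first use, so no KeyError
    for read_id in read_ids:
        # dict.get returns the fallback label instead of raising KeyError
        barcode = classifications.get(read_id, NOT_FOUND)
        bins[barcode].append(read_id)
    return dict(bins)


# Made-up example: read_c was omitted from the classification output.
example_classifications = {"read_a": "1", "read_b": "2"}
example_bins = bin_reads(example_classifications, ["read_a", "read_b", "read_c"])
```

Creating the fallback bin lazily means a run where every read is classified produces no extra output.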

Maybe I just have an old version of the code.

enable_not_found_bin_patch.txt

Any chance this pull request will be merged anytime soon?

I have a large run (380 GB of fast5) where I'm not done basecalling, but I want to take the 120 GB of fastq files I have basecalled already and debarcode them. It's throwing KeyError("not found") for those fastq files not yet basecalled from the fast5 files.
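One way to check in advance which reads would trigger the error is to compare the read IDs in the FASTQ against those in the classification output. The sketch below assumes a tab-delimited classification file whose first column is the read ID and whose header row starts with read_ID, and a FASTQ with exactly four lines per record. Both layouts are assumptions, and the function names and sample data are hypothetical.

```python
def classified_ids(lines, header_name="read_ID"):
    """Collect read IDs from classification lines (assumed tab-delimited)."""
    ids = set()
    for line in lines:
        first = line.rstrip("\n").split("\t")[0]
        # Skip blank lines and the assumed header row.
        if first and first != header_name:
            ids.add(first)
    return ids


def fastq_read_ids(lines):
    """Yield read IDs from FASTQ lines, assuming four lines per record."""
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.startswith("@"):
            # The ID is the first whitespace-separated token after '@'.
            yield line[1:].split()[0]


def find_unclassified(classification_lines, fastq_lines):
    """Return FASTQ read IDs that have no entry in the classification."""
    known = classified_ids(classification_lines)
    return [rid for rid in fastq_read_ids(fastq_lines) if rid not in known]


# Made-up example data: r3 is in the FASTQ but was never classified.
example_classes = ["read_ID\tbarcode_call\n", "r1\t3\n", "r2\tnone\n"]
example_fastq = [
    "@r1 runid=x\n", "ACGT\n", "+\n", "!!!!\n",
    "@r3\n", "GG\n", "+\n", "!!\n",
]
example_missing = find_unclassified(example_classes, example_fastq)
```

With real files, the same functions accept open file handles in place of the example lists.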

ttubb commented

jrherr, I don't know if this is a suitable solution for you, but if you acquired Deepbinner using git, you can merge the hotfix into your local repository with:
git pull origin pull/30/head

In case you need a Docker container, I have just added the above line to mine. It's on Docker Hub: https://hub.docker.com/r/ttubb/deepbinner/dockerfile (currently waiting on Docker Hub to build the updated version; it should be done in a few hours).

Thanks @ttubb! I needed a couple of days to implement, but this worked like a charm!

I did exactly as you said, merged the hotfix, and reinstalled deepbinner using pip. Worked great!