Support for multi-fast5 input?
yjx1217 opened this issue · 5 comments
Hello, I have been testing Deepbinner-0.2.0 (github commit 886efc0) on our local server. It ran well for our older MinION data (single fast5 files; before the recent MinKNOW update) but encountered an error with our new MinION data (multi-fast5 files; after the recent MinKNOW update). So I was wondering whether Deepbinner supports multi-fast5 input yet. Thanks in advance! See below for the error message:
2018-12-28 11:55:39.805480: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Using TensorFlow backend.
Loading /home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/models/EXP-NBD103_read_starts... done
Loading /home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/models/EXP-NBD103_read_ends... done
Looking for fast5 files in /home/jxyue/Projects/LRSDAY-v1.3.0/Project_Jonas/00.Long_Reads/Basecalling_Guppy_out... 318 fast5s found
Classifying fast5s: 0 / 318 (0.0%)
Traceback (most recent call last):
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/bin/deepbinner", line 11, in <module>
sys.exit(main())
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/deepbinner.py", line 60, in main
classify(args)
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/classify.py", line 46, in classify
output_size, args)
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/classify.py", line 127, in classify_fast5_files
read_id, signal = get_read_id_and_signal(fast5_file)
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/load_fast5s.py", line 27, in get_read_id_and_signal
read_group = list(hdf5_file['Raw/Reads/'].values())[0]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/h5py/_hl/group.py", line 262, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: 'Unable to open object (component not found)'
Loading classifications... done
0 total classifications found
Writing reads: 0
Traceback (most recent call last):
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/bin/deepbinner", line 11, in <module>
sys.exit(main())
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/deepbinner.py", line 64, in main
bin_reads(args)
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/bin.py", line 32, in bin_reads
write_read_files(args.reads, classifications, out_filenames, input_type)
File "/home/jxyue/Projects/LRSDAY-v1.3.0/build/PATCH/Deepbinner-0.2.0/py3_virtualenv_deepbinner/lib/python3.4/site-packages/deepbinner/bin.py", line 146, in write_read_files
out_files[class_name].write(read_line_1)
KeyError: 'not found'
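The first traceback points at the cause: Deepbinner's `load_fast5s.py` looks for a `Raw/Reads/` group at the root of each file, which is where single-read fast5s keep their signal. Multi-read fast5s instead store each read under its own top-level `read_<uuid>` group, so the root-level lookup raises `KeyError`. A minimal sketch with mock HDF5 files (the layouts below are simplified approximations of the two fast5 formats, not complete ones):

```python
# Sketch: why a root-level 'Raw/Reads/' lookup fails on multi-read fast5s.
# Builds two tiny mock files with h5py (simplified layouts, not full fast5s).
import os
import tempfile

import h5py
import numpy as np

tmp = tempfile.mkdtemp()

# Single-read layout: signal lives under /Raw/Reads/Read_<n>/Signal
single = os.path.join(tmp, "single.fast5")
with h5py.File(single, "w") as f:
    f.create_dataset("Raw/Reads/Read_1/Signal", data=np.arange(10, dtype=np.int16))
    f["Raw/Reads/Read_1"].attrs["read_id"] = "read-aaa"

# Multi-read layout: each read is its own top-level read_<uuid> group
multi = os.path.join(tmp, "multi.fast5")
with h5py.File(multi, "w") as f:
    f.create_dataset("read_aaa/Raw/Signal", data=np.arange(10, dtype=np.int16))

with h5py.File(single, "r") as f:
    read_group = list(f["Raw/Reads/"].values())[0]  # works: group exists at root
    print("single-read file, read_id:", read_group.attrs["read_id"])

with h5py.File(multi, "r") as f:
    try:
        list(f["Raw/Reads/"].values())  # no root-level Raw/Reads/ group here
    except KeyError as e:
        print("multi-read file:", e)
```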
I also ran into the same problem. Any progress on this?
I have the same problem...
In case anyone is still looking for a solution, you can use multi_to_single_fast5 from https://github.com/nanoporetech/ont_fast5_api to convert multi-read fast5s to single-read fast5s, and then run Deepbinner on the result.
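For reference, the workaround might look like the following. This is a sketch: the directory names are placeholders, and the flags are taken from the ont_fast5_api and Deepbinner READMEs as I understand them, so double-check against the current docs.

```shell
# Install the conversion tool
pip install ont-fast5-api

# Write one single-read fast5 per read into single_fast5s/
multi_to_single_fast5 --input_path multi_fast5s/ \
                      --save_path single_fast5s/ \
                      --threads 4

# Classify with the native-barcoding models, then bin the basecalled reads
deepbinner classify --native single_fast5s/ > classifications
deepbinner bin --classes classifications \
               --reads basecalled_reads.fastq.gz \
               --out_dir demultiplexed_reads/
```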
Hi all!
@bsaintjo gave the simplest answer: just turn your reads into single-read fast5s first. But that doesn't help if you want to run Deepbinner in real time, so I just pushed up an update to make the `deepbinner realtime` command work with multi-read fast5 files. It uses multi_to_single_fast5 internally, so you'll still need that installed.
See more info here in the README.
Ryan
Hi @rrwick ,
Excuse my ignorance, but I don't see why Deepbinner can't classify multi-fast5s. You mention in the README:

> if one fast5 file contains reads from more than one barcode, then it cannot simply be moved into a bin

But each entry within a multi-fast5 has a unique ID, so surely that is all you need? That is, the classification file just records each read ID and the barcode it maps to, and then the read with that ID in Guppy's fastq output gets binned accordingly.
It would obviously require a change in how you handle fast5 files, though.
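The proposal above can be sketched as pure read-ID bookkeeping. This is a hypothetical helper, not Deepbinner's actual code: given a read-ID-to-barcode mapping from signal classification, the basecalled FASTQ records are binned without ever moving fast5 files.

```python
# Sketch of read-ID-based binning (hypothetical helper, not Deepbinner code):
# the classification maps read IDs to barcodes, and FASTQ records are grouped
# into bins by looking up each record's read ID.
from collections import defaultdict


def bin_fastq_by_read_id(fastq_lines, classifications):
    """Group 4-line FASTQ records into bins keyed by classified barcode."""
    bins = defaultdict(list)
    for i in range(0, len(fastq_lines), 4):
        record = fastq_lines[i:i + 4]
        read_id = record[0][1:].split()[0]  # header is '@<read_id> <extras>'
        barcode = classifications.get(read_id, "unclassified")
        bins[barcode].extend(record)
    return bins


# Toy example: two reads classified from their signal, one unknown
classifications = {"read-1": "barcode01", "read-2": "barcode02"}
fastq = [
    "@read-1 ch=1", "ACGT", "+", "!!!!",
    "@read-2 ch=2", "TTAA", "+", "!!!!",
    "@read-3 ch=3", "GGCC", "+", "!!!!",
]
bins = bin_fastq_by_read_id(fastq, classifications)
print(sorted(bins))  # -> ['barcode01', 'barcode02', 'unclassified']
```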