CPU mode hangs with macOS
Hello,
I'm attempting to run ribodetector_cpu on macOS (i.e. no CUDA support) on a publicly available dataset, using Python 3.8.13 and ribodetector 0.2.6 on macOS 12.1. I followed the instructions for setting up the ribodetector conda environment plus dependencies for the CPU-only mode:
conda create -n ribodetector python=3.8
conda activate ribodetector
mamba install -c bioconda ribodetector
conda install pytorch torchvision torchaudio cpuonly -c pytorch
My code is as follows:
ribodetector_cpu -t 16 \
-l 150 \
-i fastq/SRR15852393_1.fastq.gz \
fastq/SRR15852393_2.fastq.gz \
-e rrna \
--chunk_size 256 \
-o out/fastq_orig_ribodetector/SRR15852393_1.ribodetector.fastq.gz \
out/fastq_orig_ribodetector/SRR15852393_2.ribodetector.fastq.gz
and I receive the following error:
2022-06-22 16:42:44 : INFO Using high MCC model file: /opt/anaconda3/envs/ribodetector/lib/python3.8/site-packages/ribodetector/data/ribodetector_600k_variable_len70_101_epoch47.onnx on CPU
2022-06-22 16:42:44 : INFO Classify reads with chunk size 256
2022-06-22 16:42:44 : INFO Writing output non-rRNA sequences into file: out/fastq_orig_ribodetector/SRR15852393_1.ribodetector.fastq.gz, out/fastq_orig_ribodetector/SRR15852393_2.ribodetector.fastq.gz
Traceback (most recent call last):
  File "/opt/anaconda3/envs/ribodetector/bin/ribodetector_cpu", line 10, in <module>
    sys.exit(main())
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/site-packages/ribodetector/detect_cpu.py", line 746, in main
    seq_pred.detect()
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/site-packages/ribodetector/detect_cpu.py", line 526, in detect
    self.run_with_chunks()
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/site-packages/ribodetector/detect_cpu.py", line 354, in run_with_chunks
    p.start()
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/opt/anaconda3/envs/ribodetector/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'onnxruntime.capi.onnxruntime_pybind11_state.InferenceSession' object
Are there any additional dependencies for running on macOS vs. Linux? Any tips would be appreciated!
Is your CPU Apple Silicon?
I have an Intel CPU, sorry for not including hardware specs in my initial post! Hardware includes a 2.5 GHz 14-core Intel Xeon W processor, 32 GB RAM, and a Radeon Pro Vega 56 8 GB GPU.
Thank you so much for reporting this issue. I am able to reproduce it on a MacBook Pro with an Intel CPU. At first glance, it is a compatibility issue between ONNX Runtime and Python multiprocessing on macOS. I will take a closer look and hopefully fix it in the next release. In the meantime, I would recommend running RiboDetector on a powerful Linux server, which will allow you to analyze large-scale datasets in a timely manner.
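For readers following along, here is a minimal sketch of the failure mode the traceback points to. It is not RiboDetector's code and not necessarily how the fix was implemented. The traceback goes through popen_spawn_posix.py, so the child process is started with the "spawn" method (the default on macOS since Python 3.8), which pickles everything handed to the new process. FakeSession, is_picklable and worker are hypothetical names used only for illustration; FakeSession stands in for any object that cannot be pickled, as the InferenceSession in the error message cannot. One common workaround, assumed here, is to pass only picklable inputs (such as the model path) and build the session inside the worker.

```python
import pickle
import threading


class FakeSession:
    """Hypothetical stand-in for an unpicklable object such as the
    InferenceSession named in the traceback."""

    def __init__(self, model_path):
        self.model_path = model_path
        self._lock = threading.Lock()  # lock objects cannot be pickled

    def run(self, chunk):
        # Placeholder "inference": return the length of each read.
        return [len(read) for read in chunk]


def is_picklable(obj):
    """Return True if obj can be pickled, which the "spawn" start
    method requires of everything passed to a child process."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


def worker(model_path, chunk):
    """Create the session inside the worker, so that only the path and
    the chunk (both picklable) have to cross the process boundary."""
    session = FakeSession(model_path)
    return session.run(chunk)
```

With this layout, something like multiprocessing.Process(target=worker, args=(model_path, chunk)) only needs to pickle a string and a list, whereas passing an object that holds the session reproduces the TypeError above.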
@dawnmy Any resolution to this? I encounter the same issue and have no access to a Linux server.
I will try to fix it this weekend.
@omicz @ywlim-sea Thank you for reporting this issue. It has been fixed in version 0.2.7! Please give it a try.