instadeepai/InstaNovo

InstaNovo on Windows

Closed this issue · 3 comments

Neither the repository nor the publication mentions a required OS. As far as I can tell, InstaNovo will not run on Windows. That's fine, but ideally the documentation should say so.

I tried installing on Windows 11, but the installation of the InstaNovo package failed at calls to the Linux command 'lscpu':

[WARNING] cpu_adam requires the 'lscpu' command, but it does not exist!
[WARNING] cpu_adam attempted to query 'lscpu' after failing to use py-cpuinfo to detect the CPU architecture. 'lscpu' does not appear to exist on your system, will fall back to use -march=native and non-vectorized execution.

...
[WARNING] cpu_adagrad requires the 'lscpu' command, but it does not exist!
[WARNING] cpu_adagrad attempted to query 'lscpu' after failing to use py-cpuinfo to detect the CPU architecture. 'lscpu' does not appear to exist on your system, will fall back to use -march=native and non-vectorized execution.

On the same computer, I was able to install InstaNovo into Windows Subsystem for Linux (WSL) running Ubuntu. I ran a few tests of torch, and the GPU appeared to be accessible. When I try to run an InstaNovo search, it initializes but then hangs at the actual search step, where it might use the GPU:
(instanovo) richardj@EthicsGradient:~/InstaNovo$ python -m instanovo.transformer.predict 20230606_MK_E1200_MVL_FAIMS_2CV_15cm_70min_0000_DDA_200ng_QC1_Hela_QC.ipc instanovo.pt --denovo --output_path instanovo.HeLaTest.csv
INFO:root:Initializing inference.
INFO:root:Loading data from 20230606_MK_E1200_MVL_FAIMS_2CV_15cm_70min_0000_DDA_200ng_QC1_Hela_QC.ipc
INFO:root:Data loaded, evaluating 100.0%, 46409 samples in total.
INFO:root:Knapsack path missing or not specified, generating...
INFO:root:Scaling masses.
INFO:root:Initializing chart.
INFO:root:Performing search.
0%| | 0/726 [00:00<?, ?it/s]

I left the search running overnight and it made no progress. Here is a repeat search that I left for a few minutes and then stopped with Ctrl+C:
0%| | 0/726 [08:09<?, ?it/s]
Traceback (most recent call last):
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/site-packages/instanovo/transformer/predict.py", line 192, in <module>
    main()
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/site-packages/instanovo/transformer/predict.py", line 175, in main
    get_preds(data_path, model, config, denovo, output_path, knapsack_path)
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/site-packages/instanovo/transformer/predict.py", line 97, in get_preds
    for _, batch in tqdm(enumerate(dl), total=len(dl)):
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1316, in _next_data
    idx, data = self._get_data()
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1282, in _get_data
    success, data = self._try_get_data()
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1120, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/multiprocessing/queues.py", line 107, in get
    if not self._poll(timeout):
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/multiprocessing/connection.py", line 257, in poll
    return self._poll(timeout)
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
    r = wait([self], timeout)
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/multiprocessing/connection.py", line 931, in wait
    ready = selector.select(timeout)
  File "/home/richardj/miniconda3/envs/instanovo/lib/python3.8/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt

Hi, we don't officially support Windows, but you should be able to run InstaNovo anyway.

For Windows, I would manually install the dependencies one at a time to make sure they all work:

pip install instanovo --no-deps

Then install the dependencies listed in requirements.txt. I would not install DeepSpeed, as it's not required for inference and doesn't support training on Windows anyway. We will update the requirements to reflect this.
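
As a rough illustration of the step above, the following sketch filters a requirements list so that DeepSpeed is skipped and prints one `pip install` command per remaining dependency. The helper name `installable_requirements` and the sample contents are hypothetical, not InstaNovo's actual requirements.txt, and the parsing is deliberately simple (it treats `#` as a comment marker, so URL requirements with `#egg=` fragments would need extra care).

```python
import re
from typing import Iterable, List, Sequence


def installable_requirements(
    lines: Iterable[str], skip: Sequence[str] = ("deepspeed",)
) -> List[str]:
    """Return requirement specifiers, dropping comments, blanks, pip options,
    and any package whose name is listed in `skip` (case-insensitive)."""
    skipped = {name.lower() for name in skip}
    kept = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # blank line or option such as -r / -e
            continue
        # The package name ends at the first version operator, extra, marker, or space.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0].lower()
        if name in skipped:
            continue
        kept.append(line)
    return kept


# Hypothetical sample; read the real requirements.txt from the repository instead.
sample = """\
# hypothetical requirements
torch
deepspeed
tqdm
"""

for requirement in installable_requirements(sample.splitlines()):
    # Quote the specifier so shells do not treat '>' or '<' as redirection.
    print(f'pip install "{requirement}"')
```

Running the printed commands one at a time makes it easier to see which dependency, if any, fails to install on Windows.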

For running InstaNovo on Windows Subsystem for Linux, I assume this issue is caused by Windows combined with the worker processes used by PyTorch dataloaders. In my experience, Windows causes PyTorch dataloaders to hang when n_workers > 0. When calling instanovo.transformer.predict, add the flag --n_workers 0 to work around this. Hope this helps!
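
A minimal sketch of the workaround, assuming the `--n_workers` flag behaves as described in this reply: it builds the same `predict` command used earlier in the thread and appends `--n_workers 0` when the script appears to be running on Windows or WSL. The function names and file names are hypothetical, and checking for "microsoft" in the kernel release string is a common heuristic for detecting WSL rather than an official API.

```python
import platform
import shlex
from typing import List, Optional


def dataloader_workers_may_hang(
    system: Optional[str] = None, release: Optional[str] = None
) -> bool:
    """Heuristic: True on native Windows or when the kernel release looks like WSL."""
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release
    return system == "Windows" or "microsoft" in release.lower()


def build_predict_command(
    data_path: str,
    model_path: str,
    output_path: str,
    n_workers: Optional[int] = None,
    python: str = "python",
) -> List[str]:
    """Build the argument list for instanovo.transformer.predict."""
    cmd = [
        python, "-m", "instanovo.transformer.predict",
        data_path, model_path,
        "--denovo", "--output_path", output_path,
    ]
    if n_workers is not None:
        # Flag name taken from this thread; 0 keeps data loading in the main process.
        cmd += ["--n_workers", str(n_workers)]
    return cmd


# Hypothetical file names for illustration only.
workers = 0 if dataloader_workers_may_hang() else None
command = build_predict_command("spectra.ipc", "instanovo.pt", "predictions.csv", workers)
print(shlex.join(command))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run` directly.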

We will update the documentation to make this clearer.

Hello @lostculture,

Have you been able to run InstaNovo on Windows or WSL?

I used Windows Subsystem for Linux, and adding the flag --n_workers 0 resolved the issue. Thank you.