Numpy error with 6.0b for some sequences
Closed this issue · 2 comments
Hi there!
Thanks for your great work.
I was testing out the update and came across an issue while running the signalp5 benchmark set. It occurs during the marginal conflict resolution step.
This sequence:
>A0R1E8|POSITIVE|LIPO|2
MTQNCVAPVAIIGMACRLPGAINSPQQLWEALLRGDDFVTEIPTGRWDAEEYYDPEPGVPGRSVSKWGAF
from https://services.healthtech.dtu.dk/services/SignalP-6.0/public_data/benchmark_set_sp5.fasta
appears to be the issue.
Running version 6.0b in "fast" mode on this sequence, with the organism set to either other or eukaryote, causes the following error.
$ signalp6 --output_dir test --format txt --organism euk --mode fast --fastafile test.fasta
/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/torch/nn/modules/module.py:1051: UserWarning: where received a uint8 condition tensor. This behavior is deprecated and will be removed in a future version of PyTorch. Use a boolean condition instead. (Triggered internally at /tmp/pip-req-build-1ky46svp/aten/src/ATen/native/TensorCompare.cpp:255.)
return forward_call(*input, **kwargs)
Predicting: 100%|| 1/1 [00:00<00:00, 1.53batch/s]
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/sp6/bin/signalp6", line 8, in <module>
    sys.exit(predict())
  File "/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/signalp/__init__.py", line 6, in predict
    main()
  File "/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/signalp/predict.py", line 235, in main
    resolve_viterbi_marginal_conflicts(global_probs, marginal_probs, cleavage_sites, viterbi_paths)
  File "/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/signalp/utils.py", line 254, in resolve_viterbi_marginal_conflicts
    cleavage_sites[i] = sp_idx.max() +1
  File "/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/numpy/core/_methods.py", line 39, in _amax
    return umr_maximum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation maximum which has no identity
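For reference, the final ValueError can be reproduced with NumPy alone, which suggests that `sp_idx` was empty for this sequence. This is a minimal sketch, not the SignalP code: the path array and the `SP_LABEL` value are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for a predicted path that contains no
# signal-peptide positions (label values are assumptions).
viterbi_path = np.array([0, 0, 0, 0])
SP_LABEL = 1

# No position matches, so the index array is empty.
sp_idx = np.where(viterbi_path == SP_LABEL)[0]

try:
    # Same expression as in the traceback: max() of an empty array fails.
    cleavage_site = sp_idx.max() + 1
except ValueError as err:
    message = str(err)
    print(message)
```

Checking `sp_idx.size` before calling `max()` avoids the exception.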
This doesn't appear to be an issue with the previous version available for download.
Both installations have identical versions of the main dependencies:
python 3.6.13
numpy 1.19.5
pytorch 1.9.1
tqdm 4.62.3
Thanks in advance,
Darcy
Hi Darcy, thanks a lot for raising this!
Turns out there was an issue in the conflict-resolving function when processing Sec/SPII and Tat/SPII lipoproteins. I added logic to handle those as a separate case: when the cleavage site is missing, it is imputed from the predicted modified cysteine that follows it.
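The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: the function name `impute_cleavage_site`, the label values in `SP_LABELS` and `CYS_LABEL`, and the `-1` sentinel are all hypothetical.

```python
import numpy as np

# Hypothetical label values, chosen only for this sketch.
SP_LABELS = (1, 2, 3)  # signal-peptide region labels
CYS_LABEL = 4          # modified cysteine right after the cleavage site


def impute_cleavage_site(path):
    """Return the 1-based position of the last signal-peptide residue.

    Falls back to the position just before the modified cysteine when
    no signal-peptide labels are present, and to -1 when neither is found.
    """
    path = np.asarray(path)
    sp_idx = np.where(np.isin(path, SP_LABELS))[0]
    if sp_idx.size > 0:
        # Usual case: last signal-peptide index, converted to 1-based.
        return int(sp_idx.max()) + 1

    cys_idx = np.where(path == CYS_LABEL)[0]
    if cys_idx.size > 0:
        # The cysteine sits directly after the cleavage site, so its
        # 0-based index equals the 1-based position of the residue before it.
        return int(cys_idx.min())

    return -1  # nothing to impute from


print(impute_cleavage_site([1, 1, 2, 3, 4, 0, 0]))
print(impute_cleavage_site([0, 0, 0, 4, 0]))
```

The size check before `max()` is what prevents the zero-size reduction error from the traceback above.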
The online version is patched; I'll close the issue once the updated downloads go live.
Cool, thanks!