gtonkinhill/panaroo

Sporadic KeyError exceptions raised during collapse_families

Opened this issue · 4 comments

I'm running panaroo 1.5.0 in a Linux Docker container, installed via conda using the commands below. The panaroo command seems to fail at random despite using the exact same inputs and arguments. My workflow manager retries the panaroo task when it fails, and it succeeds on follow-up attempts.

I'm not sure why this is happening. Perhaps it's related to using multiple threads? Also, I've seen others suggest using the --remove-invalid-genes flag to fix KeyError errors elsewhere, but I struggle to imagine how that would explain the cause of these seemingly random failures.

Panaroo command:

panaroo \
    --clean-mode strict \
    -i gff_list.txt \
    -o panaroo_out \
    --threads 72

Installation:

RUN micromamba install -y -n base -c conda-forge -c bioconda -c defaults python=3.9 iqtree panaroo=1.5.0 && \
    micromamba clean -a -y

Exception Message:

collapse gene families...
Traceback (most recent call last):
  File "/data/bin/bin/panaroo", line 10, in <module>
    sys.exit(main())
  File "/data/bin/lib/python3.9/site-packages/panaroo/__main__.py", line 398, in main
    G, distances_bwtn_centroids, centroid_to_index = collapse_families(
  File "/data/bin/lib/python3.9/site-packages/panaroo/clean_network.py", line 141, in collapse_families
    seqid_to_index[sid] = centroid_to_index[seqid_to_centroid[sid]]
KeyError: '0_0_4912'

Exception context:

    # keep track of centroids for each sequence. Need this to resolve clashes
    seqid_to_index = {}
    for node in G.nodes():
        for sid in G.nodes[node]['seqIDs']:
            if "refound" in sid:
                seqid_to_index[sid] = centroid_to_index[G.nodes[node]
                                                        ["longCentroidID"][1]]
            else:
                seqid_to_index[sid] = centroid_to_index[seqid_to_centroid[sid]]
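
One thing worth noting about the failing line: it performs two dictionary lookups, so the KeyError alone does not say whether '0_0_4912' is missing from seqid_to_centroid or from centroid_to_index. Below is a minimal diagnostic sketch that separates the two cases. The function name find_unmapped_seqids is hypothetical and not part of Panaroo; it assumes nodes carry the 'seqIDs' and 'longCentroidID' attributes seen in the snippet above and accepts (node, attrs) pairs such as those yielded by G.nodes(data=True).

```python
def find_unmapped_seqids(nodes_with_data, seqid_to_centroid, centroid_to_index):
    """Return (seq_id, reason) pairs for IDs that cannot be mapped to an index.

    Hypothetical helper, not part of Panaroo. `nodes_with_data` is any
    iterable of (node, attrs) pairs, e.g. G.nodes(data=True).
    """
    problems = []
    for _node, attrs in nodes_with_data:
        for sid in attrs["seqIDs"]:
            if "refound" in sid:
                # Refound genes take their centroid from the node itself.
                centroid = attrs["longCentroidID"][1]
            elif sid in seqid_to_centroid:
                centroid = seqid_to_centroid[sid]
            else:
                # First lookup would fail: the sequence has no centroid entry.
                problems.append((sid, "missing from seqid_to_centroid"))
                continue
            if centroid not in centroid_to_index:
                # Second lookup would fail: the centroid has no index entry.
                problems.append(
                    (sid, f"centroid {centroid!r} missing from centroid_to_index")
                )
    return problems
```

Called just before the loop that builds seqid_to_index, this would list every sequence ID that is about to raise, along with which of the two dictionaries is incomplete. That might help narrow down where the intermittent failure originates.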

Hi,

I just received a very similar error with panaroo/1.3.4. I was also running with multiple threads. Have you found anything that fixes it?

Message:

    G, distances_bwtn_centroids, centroid_to_index = collapse_families(
  File "/usr/local/lib/python3.10/site-packages/panaroo/clean_network.py", line 141, in collapse_families
    seqid_to_index[sid] = centroid_to_index[seqid_to_centroid[sid]]
KeyError: '6_5_63'

Hi both,

Thanks very much for pointing this out. I had hoped this issue was fixed in a previous release.
If either of you is able to create a consistently reproducible example, that would be great, as the bug is challenging to fix when it only occurs randomly.
I've got a few deadlines at the moment but am hoping to get back to Panaroo development next week.

Unfortunately, I have not been able to reproduce this error. However, I was able to confirm that the initial pre_filt_graph.gml file appears to be built correctly when the error occurs.

Therefore, the issue is likely within the collapse_families function. My current suspicion is that the pwdist_edlib sub-function, which generates the centroid_to_index dictionary, fails when many threads are in use.

If this is the case, the error is unlikely to result in incorrect output. Instead, it will simply cause Panaroo to crash with a rather cryptic error message.

I will leave this issue open for now in case someone can provide a reproducible example. In the meantime, I recommend users encountering this issue re-run Panaroo with fewer threads.

Hello, I've encountered this error again. In the absence of a permanent fix, would it make sense to wrap the call in a try-except block that catches the KeyError exception? The script could then retry the collapse_families function with two fewer threads than originally specified. If the retry works, it would save me from having to re-run the entire Panaroo workflow from scratch.

It might also be useful to allow the user to specify the number of retry attempts. Logging the retries and any exceptions that occur during the process could help with troubleshooting.

Thank you for your consideration!
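
To make the suggestion concrete, here is a rough sketch of the retry logic. It is not Panaroo code: run_with_thread_backoff and its parameters are hypothetical names, and the step being retried is passed in as a callable because the real signature of collapse_families is not shown in this thread. It also assumes the failed call leaves its inputs untouched; if the function modifies the graph in place, the callable would need to work on a copy.

```python
import logging

logger = logging.getLogger("panaroo_retry")  # hypothetical logger name


def run_with_thread_backoff(step, threads, max_retries=3, decrement=2):
    """Call step(n_threads); on KeyError, retry with fewer threads.

    `step` is any callable taking the thread count, e.g. a lambda that
    wraps the collapse_families call. The thread count never drops below 1.
    """
    n_threads = threads
    for attempt in range(max_retries + 1):
        try:
            return step(n_threads)
        except KeyError as err:
            if attempt == max_retries:
                logger.error("giving up after %d retries", max_retries)
                raise
            next_threads = max(1, n_threads - decrement)
            logger.warning(
                "attempt %d with %d threads raised KeyError %s; "
                "retrying with %d threads",
                attempt + 1, n_threads, err, next_threads,
            )
            n_threads = next_threads
```

With --threads 72 and the defaults above, this would try 72, 70, 68 and 66 threads before re-raising the original exception. The retry count and decrement could be exposed as command-line options if that seems useful.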