zktuong/dandelion

Singularity - C call correction local variable referenced before assignment

Closed this issue · 4 comments

Description of the bug

I'm preprocessing some samples using the Singularity container and encountering an error during C call correction that causes the script to exit. It looks like it's trying to reference a variable (tmp2) that hasn't been assigned yet.

Minimal reproducible example

singularity run -B "$PWD" ../sc-dandelion_latest.sif dandelion-preprocess \
    --filter_to_high_confidence \
    --file_prefix "all" \
    --meta d14__meta.csv

command line parameters:
: --------------------------------------------------------------
    --meta = d14__meta.csv
    --chain = ig
    --org = human
    --file_prefix = all
    --sep = _
    --flavour = strict
    --skip_format_header = False
    --filter_to_high_confidence = True
    --keep_trailing_hyphen_number = False
    --skip_reassign_dj = False
    --skip_correct_c = False
    --clean_output = False
: --------------------------------------------------------------

The error message produced by the code above

Traceback (most recent call last):
  File "/share/dandelion_preprocess.py", line 338, in <module>
    main()
  File "/share/dandelion_preprocess.py", line 305, in main
    ddl.pp.assign_isotypes(
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py", line 892, in assign_isotypes
    assign_isotype(
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py", line 710, in assign_isotype
    dat_10x = read_10x_vdj(_10xfile)
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/utilities/_io.py", line 650, in read_10x_vdj
    return Dandelion(res)
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/utilities/_core.py", line 138, in __init__
    self.update_metadata(**kwargs)
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/utilities/_core.py", line 910, in update_metadata
    initialize_metadata(self, cols, clonekey, collapse_alleles)
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/utilities/_core.py", line 2088, in initialize_metadata
    tmp_metadata["locus_status"] = format_locus(tmp_metadata)
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/utilities/_utilities.py", line 846, in format_locus
    if any(tmp == "ambiguous" for tmp in [tmp1, tmp2]):
UnboundLocalError: local variable 'tmp2' referenced before assignment
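
The traceback shows only the line that reads tmp2, not where it should have been assigned. Below is a minimal Python sketch of how this class of error typically arises, with one way to guard against it. The function names and the branching on the number of loci are hypothetical and are not taken from dandelion's format_locus; the actual cause in the library may differ.

```python
# Hypothetical reduction of the failure; NOT dandelion's actual code.

def locus_status_buggy(loci):
    """Assign tmp1/tmp2 only inside conditional branches."""
    if len(loci) >= 1:
        tmp1 = loci[0]
    if len(loci) >= 2:
        tmp2 = loci[1]  # never runs when only one locus is supplied
    # With a single locus, tmp2 is unbound here and Python raises
    # UnboundLocalError, as in the traceback above.
    if any(tmp == "ambiguous" for tmp in [tmp1, tmp2]):
        return "ambiguous"
    return " + ".join([tmp1, tmp2])


def locus_status_fixed(loci):
    """Bind both names on every code path before they are read."""
    tmp1 = loci[0] if len(loci) >= 1 else None
    tmp2 = loci[1] if len(loci) >= 2 else None
    if any(tmp == "ambiguous" for tmp in [tmp1, tmp2]):
        return "ambiguous"
    # Skip the placeholders so a single locus is reported on its own.
    return " + ".join(tmp for tmp in [tmp1, tmp2] if tmp is not None)
```

Giving every local a default before any branch runs means an unexpected input shape produces a defined result instead of terminating the whole preprocessing run.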

OS information

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

Version information

Software versions:
dandelion==0.3.4.dev5
pandas==2.1.1
numpy==1.24.4
matplotlib==3.8.0
networkx==3.1
scipy==1.11.3

Additional context

No response

thanks @chrish935 for finding this bug!

I've added a fix at #325 which hopefully solves this. I will try to build a Singularity image and ask you to test whether it resolves your error.

can you try with:

singularity pull library://kt16/default/sc-dandelion:dev

That looks like it fixed the problem, but it also started to complain a lot when quantifying mutations, which I don't recall it doing before (see below; error was printed to the console twice per sample). It doesn't seem to cause any real issues though as the resultant .tsv files all have the mu_count & mu_freq columns.

Code:

singularity run -B "$PWD" ../sc-dandelion_dev.sif dandelion-preprocess \
    --filter_to_high_confidence \
    --file_prefix "all" \
    --meta d14__meta.csv

Error:

finished: saving DataFrame at d14_1/dandelion/all_contig_dandelion.tsv
 (0:00:28)
Quantifying mutations
R[write to console]: Error: package or namespace load failed for ‘shazam’:
 .onAttach failed in attachNamespace() for 'shazam', details:
  call: file(con, "r")
  error: cannot open the connection

 finished: saving DataFrame at d14_1/dandelion/all_contig_dandelion.tsv
 (0:00:08)
Quantifying mutations
R[write to console]: Error: package or namespace load failed for ‘shazam’:
 .onAttach failed in attachNamespace() for 'shazam', details:
  call: file(con, "r")
  error: cannot open the connection
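
For anyone who wants to confirm that the mutation columns made it into the output despite the shazam message, here is a small sketch using only the Python standard library. The helper name is hypothetical and not part of dandelion, and it checks only that the column headers exist, not that the values are populated.

```python
import csv

# Column names mentioned in this thread.
REQUIRED = ("mu_count", "mu_freq")


def missing_mutation_columns(tsv_path, required=REQUIRED):
    """Return the required column names absent from the TSV header."""
    with open(tsv_path, newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle, delimiter="\t"), [])
    return [name for name in required if name not in header]
```

Calling missing_mutation_columns("d14_1/dandelion/all_contig_dandelion.tsv") returns an empty list when both columns are present.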

hmm ok thanks for letting me know!