zktuong/dandelion

Bugs in 'pp.check_contigs'

Closed · 1 comment

Description of the bug

When running 'pp.check_contigs' with default settings, I got the following error: 'local variable "vdj_ccall_p_igm_count" referenced before assignment'. However, the 'pp.filter_contigs' function works fine.

Minimal reproducible example

bcr_vdj, adata_BP = ddl.pp.check_contigs(bcr_vdj, adata_BP)

The error message produced by the code above

Preparing data: 267882it [01:44, 2568.34it/s]
Scanning for poor quality/ambiguous contigs:   0%|          | 1/220792 [00:00<23:15, 158.23it/s]
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[53], line 1
----> 1 bcr_vdj, adata_BP = ddl.pp.check_contigs(bcr_vdj, adata_BP)
      2 #>local variable 'vdj_ccall_p_igm_count' referenced before assignment

File ~/.conda/envs/dandelion/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py:5199, in check_contigs(data, adata, productive_only, library_type, umi_foldchange_cutoff, filter_missing, filter_extra, save, verbose, **kwargs)
   5197     adata_ = ad.AnnData(obs=obs)
   5198     adata_.obs["has_contig"] = "True"
-> 5199 contig_status = MarkAmbiguousContigs(dat, umi_foldchange_cutoff, verbose)
   5201 ambigous = contig_status.ambiguous_contigs.copy()
   5202 extra = contig_status.extra_contigs.copy()

File ~/.conda/envs/dandelion/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py:5387, in MarkAmbiguousContigs.__init__(self, data, umi_foldchange_cutoff, verbose)
   5376     vdj_ccall_p_igm_count = dict(
   5377         data1[data1["c_call"] == "IGHM"][
   5378             "umi_count"
   5379         ]
   5380     )
   5381     vdj_ccall_p_igd_count = dict(
   5382         data1[data1["c_call"] == "IGHD"][
   5383             "umi_count"
   5384         ]
   5385     )
-> 5387 if len(vdj_ccall_p_igm_count) > 1:
   5388     (
   5389         keep_igm,
   5390         extra_igm,
   (...)
   5394         umi_foldchange_cutoff,
   5395     )
   5396 else:

UnboundLocalError: local variable 'vdj_ccall_p_igm_count' referenced before assignment
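
For context on what this kind of error usually means: in the traceback, the assignments to `vdj_ccall_p_igm_count` (lines 5376-5385) are indented deeper than the `if` on line 5387, which suggests they sit inside a conditional or loop body that was skipped for this cell. The enclosing condition is not visible in the traceback, so the following is only a minimal sketch of the general pattern and one defensive fix, not dandelion's actual code. The function names and the `productive` flag are hypothetical.

```python
def has_multiple_igm_buggy(c_calls, umi_counts, productive):
    """Hypothetical reproduction: the dict is bound only inside a branch."""
    if productive:  # assumed condition; the real one is not shown in the traceback
        igm_count = {
            i: umi
            for i, (c_call, umi) in enumerate(zip(c_calls, umi_counts))
            if c_call == "IGHM"
        }
    # If the branch above was skipped, igm_count was never bound,
    # so this line raises UnboundLocalError.
    return len(igm_count) > 1


def has_multiple_igm_safe(c_calls, umi_counts, productive):
    """Same logic, but the name is bound before any branching."""
    igm_count = {}  # default so the later check is always valid
    if productive:
        igm_count = {
            i: umi
            for i, (c_call, umi) in enumerate(zip(c_calls, umi_counts))
            if c_call == "IGHM"
        }
    return len(igm_count) > 1


if __name__ == "__main__":
    try:
        has_multiple_igm_buggy(["IGHM"], [5], productive=False)
    except UnboundLocalError as err:
        print("buggy version failed:", type(err).__name__)
    print("safe version:", has_multiple_igm_safe(["IGHM"], [5], productive=False))
```

Initialising the variable before the branch keeps the later `len(...) > 1` check valid for every input. Whether an empty default is the right behaviour for `check_contigs` is a decision for the maintainers.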

OS information

No response

Version information

dandelion==0.3.5 pandas==2.2.0 numpy==1.26.4 matplotlib==3.8.4 networkx==3.1 scipy==1.13.0

Additional context

No response

Thanks for the bug report, @chf8991. Can you try installing the branch at #370 and see if it helps?