interactivereport/cellxgene_VIP

Bug report : obs columns with nan cause VIP error

yyoshiaki opened this issue · 2 comments

Hi,

I noticed that when an obs column contains NaN, VIP does not recognize the cells correctly, as shown in the attached image.

(Attached image: messageImage_1637735322742)

Example:

adata.obs['IR_VJ_1_c_call']

index
AAACCTGAGACAAGCC-1-0      IGKC
AAACCTGAGAGTCGGT-1-0      IGKC
AAACCTGAGCACACAG-1-0     NaN
AAACCTGAGGAATCGC-1-0     IGLC1
AAACCTGAGTGAAGAG-1-0      IGKC
                         ...  
TTTGTCAGTTAAGAAC-1-11    IGLC2
TTTGTCATCAAACCGT-1-11    IGLC2
TTTGTCATCAACACCA-1-11    IGLC2
TTTGTCATCCCAAGTA-1-11      NaN
TTTGTCATCCGAGCCA-1-11    IGLC2
Name: IR_VJ_1_c_call, Length: 78638, dtype: category
Categories (5, object): ['IGKC', 'IGLC1', 'IGLC2', 'IGLC3', 'IGLC7']

To turn the NaN values into strings, I converted the affected obs columns to str and then back to category, which resolved the error.

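# Export the raw data as a standalone AnnData object
adata_cg = adata.raw.to_adata()
for c in adata_cg.obs.columns:
    # Only touch columns that contain missing values
    if adata_cg.obs[c].isna().any():
        # Cast to str, then back to category
        adata_cg.obs[c] = adata_cg.obs[c].astype(str).astype("category")
adata_cg.write(results_file_cellxgene)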

Although this is probably a rare case, I am reporting it because a fix would improve cellxgene_VIP.

best,
Yoshi

I think PR #65 might solve your issue, but a maintainer will need to confirm whether my changes are valid.

Hi @yyoshiaki, thanks for the report, and @michaeleekk for the proposed fix.
We currently don't allow "NaN" or "Null" in the obs categorical annotations because they affect "Abbr. & Combine". For now, we suggest that you change "NaN" or "Null" to "NA" in your obs.
We will add a message when we detect "NaN" or "Null" ('undefined' in JS).

Please reopen this issue if you still encounter a problem after changing the values to "NA".
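
For readers applying the suggestion above, here is a minimal sketch of one way to change NaN to "NA" in categorical obs columns. It is not part of cellxgene_VIP. The helper name fill_categorical_nan and the toy obs table are hypothetical, and a plain pandas DataFrame stands in for adata.obs. It registers "NA" as a category and then fills the missing entries with it.

```python
import numpy as np
import pandas as pd


def fill_categorical_nan(obs: pd.DataFrame, label: str = "NA") -> pd.DataFrame:
    """Return a copy of obs in which NaN in categorical columns becomes `label`.

    Hypothetical helper for illustration; not part of cellxgene_VIP.
    """
    obs = obs.copy()
    for col in obs.columns:
        series = obs[col]
        # Only categorical columns that actually contain missing values
        if isinstance(series.dtype, pd.CategoricalDtype) and series.isna().any():
            # A categorical can only hold declared categories, so add the label first
            if label not in series.cat.categories:
                series = series.cat.add_categories(label)
            obs[col] = series.fillna(label)
    return obs


# Toy stand-in for adata.obs (assumed data, modeled on the column in the report)
obs = pd.DataFrame(
    {"IR_VJ_1_c_call": pd.Categorical(["IGKC", np.nan, "IGLC1"])},
    index=["cell-1", "cell-2", "cell-3"],
)
fixed = fill_categorical_nan(obs)
```

With an AnnData object, this would presumably be applied as adata.obs = fill_categorical_nan(adata.obs) before writing the file. Unlike a str round trip, this approach leaves columns without missing values untouched and lets you choose the replacement label.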