`IndexError: list index out of range` while running the cell annotation workbook
matchy233 opened this issue · 2 comments
I was trying to run the latest cell annotation workbook on Roche's sHPC platform using jupyter lab with the provided besca2.5.3 kernel. The notebook was expected to run without errors, but I encountered an `IndexError` in cell 46 (the cell number you get if you click "run all").
The error screenshot is attached below.
I've already identified the root cause of this error: it's related to the `read_annotconfig` function in `besca/besca/tl/sig/_annot.py`.
I'm not sure since which version, but at least in pandas v2.0.2, `pd.read_csv` replaces all `NaN`-like values (including `"None"`) with `NaN` when reading a csv/tsv file. So the `"None"` entries in `sigconfig` are replaced by `NaN` in the current implementation, which affects the building of `levs` and results in the function returning an empty `levsk` list. This in turn causes the index out of range error.
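For reference, here is a minimal sketch of the behaviour outside of besca. The inline table and the `Term` column are made up for illustration (only the `Parent` column name comes from `sigconfig`), and the filtering line only approximates what `read_annotconfig` does; it is not the actual besca code.

```python
import io

import pandas as pd

# Hypothetical stand-in for the signature config file; only the "Parent"
# column name is taken from the issue, everything else is illustrative.
tsv = "Term\tParent\nTcell\tNone\nBcell\tNone\nCD4Tcell\tTcell\n"

# Default parsing, with no NA-related arguments.
sigconfig = pd.read_csv(io.StringIO(tsv), sep="\t")

# On pandas versions that treat the string "None" as a missing-value marker,
# the Parent column now holds NaN, so this string comparison matches nothing.
top_level = sigconfig.loc[sigconfig["Parent"] == "None", "Term"].tolist()
print(sigconfig["Parent"].isna().sum(), top_level)

# Indexing into the empty list is what produces the reported error.
try:
    first = top_level[0]
except IndexError as err:
    print("IndexError:", err)
```

On a pandas version that does not convert `"None"`, the same snippet finds both top-level rows and no error is raised.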
We could fix this by:
- Add `keep_default_na=False` to `read_csv`
- Add `na_filter=False` to `read_csv`
- Do not use `sigconfig["Parent"] == "None"` as the filtering criterion
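A rough sketch of how the three options could look, again on a made-up table (the `Term` column and the variable names are hypothetical, and the `isna()`-based mask is just one possible replacement criterion, not necessarily what the final PR should use):

```python
import io

import pandas as pd

# Hypothetical stand-in for the signature config file.
tsv = "Term\tParent\nTcell\tNone\nBcell\tNone\nCD4Tcell\tTcell\n"

# Option 1: do not apply pandas' built-in list of NA strings.
cfg_keep = pd.read_csv(io.StringIO(tsv), sep="\t", keep_default_na=False)
roots_keep = cfg_keep.loc[cfg_keep["Parent"] == "None", "Term"].tolist()

# Option 2: switch off NA detection entirely.
cfg_nofilter = pd.read_csv(io.StringIO(tsv), sep="\t", na_filter=False)
roots_nofilter = cfg_nofilter.loc[cfg_nofilter["Parent"] == "None", "Term"].tolist()

# Option 3: keep the default parsing, but accept NaN as "no parent" as well
# (assumed replacement criterion, shown only as an example).
cfg_default = pd.read_csv(io.StringIO(tsv), sep="\t")
is_root = cfg_default["Parent"].isna() | (cfg_default["Parent"] == "None")
roots_mask = cfg_default.loc[is_root, "Term"].tolist()

print(roots_keep, roots_nofilter, roots_mask)
```

Options 1 and 2 change how every column in the file is parsed, so empty cells stay empty strings rather than becoming `NaN`. Option 3 leaves parsing untouched but would also treat an empty `Parent` cell as a top-level entry.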
Any of these fixes is pretty easy, so I can raise a PR for it once a dev has reviewed this issue.
Hi @matchy233, this sounds good to me. You can go ahead with your PR