snap-stanford/SATURN

Similar to KeyError: 'cell_type' #4

Closed this issue · 3 comments

Hello, thanks for creating this tool.
I am getting an error similar to the one reported previously in #4.
I have double-checked that all of my h5ad objects have a .obs.cell_type column, but I still get KeyError: 'cell_type':

Using Device 0
Set seed to 0
Traceback (most recent call last):
  File "/path/to/base.py", line 3805, in get_loc
    return self._engine.get_loc(casted_key)
  File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'cell_type'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/train-saturn.py", line 1064, in <module>
    trainer(args)
  File "/path/to/train-saturn.py", line 441, in trainer
    species_str.index = adata.obs[adata_label].index
  File "/path/to/frame.py", line 4090, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/path/to/base.py", line 3812, in get_loc
    raise KeyError(key) from err
KeyError: 'cell_type'

My command was:

python3 /path/to/train-saturn.py \
    --in_data=/path/to/all_species_run.csv \
    --in_label_col=cell_type --ref_label_col=cell_type \
    --num_macrogenes=2000 --hv_genes=8000 \
    --centroids_init_path=/path/to/saturn_results//all_species_centroids.pkl \
    --score_adata --ct_map_path=/path/to/cell_type_map.csv \
    --work_dir=/path/to/work_dir

Is there something else that could be causing this error?

Could you check that /path/to/all_species_run.csv has the correct adata paths?

I am sure the paths are correct.
I copied the paths from all_species_run.csv and tried the following for each of them:

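import scanpy as sc

# Load one of the h5ad files listed in all_species_run.csv
adata = sc.read_h5ad('path/to/adata.h5ad')
# Show the cell_type label of the first 20 cells
print(adata.obs.cell_type.head(20))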

This worked and printed the cell_type of the first 20 cells for each path.
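For anyone repeating this check, it may be less error-prone to read the paths straight from the run CSV instead of copying them by hand, so that the files being tested are exactly the ones the training script will open. The following is only a rough sketch, not part of SATURN: `find_missing_label` and `read_obs_columns` are hypothetical helper names, and the `path` column header is an assumption about how all_species_run.csv is laid out, so adjust it to match your file.

```python
import csv
from pathlib import Path


def read_obs_columns(h5ad_path):
    """Return the .obs column names of an h5ad file (needs anndata installed)."""
    import anndata as ad  # imported lazily so the CSV logic works without it

    # backed="r" keeps the expression matrix on disk; only metadata is loaded
    adata = ad.read_h5ad(h5ad_path, backed="r")
    try:
        return list(adata.obs.columns)
    finally:
        adata.file.close()


def find_missing_label(run_csv, label_col="cell_type", path_col="path",
                       get_columns=read_obs_columns):
    """List (path, reason) pairs for rows of run_csv that fail the check.

    path_col is an ASSUMED header name for the column holding h5ad paths.
    """
    problems = []
    with open(run_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            h5ad_path = row[path_col].strip()
            if not Path(h5ad_path).is_file():
                problems.append((h5ad_path, "file not found"))
            elif label_col not in get_columns(h5ad_path):
                problems.append((h5ad_path, f"no .obs[{label_col!r}]"))
    return problems
```

Calling `find_missing_label('/path/to/all_species_run.csv')` should return an empty list when every listed file exists and carries the label column; any entries it does return point at the rows worth inspecting first.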

My apologies, I found the mistake on my end; there was no issue with your code.
Thank you again for making this great package.