AnnData eval issue: cannot unpack non-iterable NoneType object
viriditax opened this issue · 15 comments
I was trying a simple test locally on an M2 MacBook with this call:
python eval_single_anndata.py --adata_path chicken_heart.h5ad --dir res --species chicken
Using sample 4 layer model
chicken_heart.h5ad ERROR
Traceback (most recent call last):
File "/scratch/UCE/eval_single_anndata.py", line 155, in <module>
main(args, accelerator)
File "/scratch/UCE/eval_single_anndata.py", line 83, in main
processor.preprocess_anndata()
File "/scratch/UCE/evaluate.py", line 93, in preprocess_anndata
self.adata, num_cells, num_genes = \
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot unpack non-iterable NoneType object
This is after successful download of model:
Downloading ./model_files/species_chrom.csv from https://figshare.com/ndownloader/files/42706558 ...
100%|█████████████████████████████████████| 4.10M/4.10M [00:00<00:00, 4.27MiB/s]
Downloading ./model_files/species_offsets.pkl from https://figshare.com/ndownloader/files/42706555 ...
100%|█████████████████████████████████████████| 139/139 [00:00<00:00, 22.3kiB/s]
Downloading resmodel_files/protein_embeddings.tar.gz from https://figshare.com/ndownloader/files/42715213 ...
100%|█████████████████████████████████████| 2.74G/2.74G [02:19<00:00, 19.7MiB/s]
Done!
Downloading ./model_files/all_tokens.torch from https://figshare.com/ndownloader/files/42706585 ...
100%|█████████████████████████████████████| 2.98G/2.98G [02:31<00:00, 19.7MiB/s]
Using sample 4 layer model
Downloading ./model_files/4layer_model.torch from https://figshare.com/ndownloader/files/42706576 ...
100%|█████████████████████████████████████| 3.40G/3.40G [02:57<00:00, 19.2MiB/s]
Same issue with some previously prepared AnnData objects from human samples with species set to human.
What does the command look like when you run using a human dataset?
The uploaded datasets already have X_uce filled in.
I've uploaded a notebook that walks through how to embed new species such as chicken:
https://github.com/snap-stanford/UCE/blob/main/data_proc/Create%20New%20Species%20Files.ipynb
It might be better to try one of the other species like human first.
python eval_single_anndata.py --adata_path scanpyscenic.h5ad --dir res --species human
The dataset has these attributes:
AnnData object with n_obs × n_vars = 8806 × 32847
obs: 'orig.ident', 'nCount_originalexp', 'nFeature_originalexp', 'sample', 'patient_id', 'anatomical_location', 'cell_type', 'cell_subtype', 'nCount_RNA', 'nFeature_RNA', 'percent.mt', 'sizeFactor', 'originalexp_snn_res.0.5', 'seurat_clusters', 'ident'
var: 'features'
obsm: 'X_PCA', 'X_UMAP'
What is the terminal output when you run that?
This can happen if there was an issue processing the anndata.
Here it is:
$ python eval_single_anndata.py --adata_path chicken_heart.h5ad --dir res --species chicken
Using sample 4 layer model
**********************************
***********chicken_heart.h5ad ERROR***********
**********************************
Traceback (most recent call last):
File "~/UCE/eval_single_anndata.py", line 155, in <module>
main(args, accelerator)
File "~/UCE/eval_single_anndata.py", line 83, in main
processor.preprocess_anndata()
File "~/UCE/evaluate.py", line 93, in preprocess_anndata
self.adata, num_cells, num_genes = \
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot unpack non-iterable NoneType object
Same error for each AnnData object.
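One pattern that would produce this exact sequence (an ERROR banner followed by the unpacking TypeError) is a helper that catches an exception, prints a message, and then returns nothing. The sketch below only illustrates that pattern: `load_and_filter` and `preprocess` are hypothetical names, not the actual UCE functions, and the failure inside the helper is simulated.

```python
def load_and_filter(path):
    """Hypothetical helper: returns (adata, num_cells, num_genes) on success."""
    try:
        # Stand-in for real preprocessing; in this sketch it always fails.
        raise ValueError(f"could not process {path}")
    except Exception:
        # The exception is swallowed: only a banner is printed, and the
        # function falls through, implicitly returning None.
        print(f"***********{path} ERROR***********")


def preprocess(path):
    """Hypothetical caller that guards against the silent None."""
    result = load_and_filter(path)
    if result is None:
        # Failing here points at the real problem instead of a TypeError.
        raise RuntimeError(f"preprocessing failed for {path}; see messages above")
    adata, num_cells, num_genes = result
    return adata, num_cells, num_genes


# Unpacking the None directly reproduces the error from the traceback.
try:
    adata, num_cells, num_genes = load_and_filter("chicken_heart.h5ad")
except TypeError as err:
    unpack_error = str(err)
    print(unpack_error)
```

If something like this is happening, the TypeError is only a symptom, and the useful information is whatever exception was caught before the banner was printed.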
What is the terminal output when you try to embed a human dataset?
Tried "human_lung_disease-001.h5ad" and got the same error. Anndata package version is 0.10.3.
In case it's helpful:
accelerate          0.25.0          pypi_0           pypi
accelerator         2023.11.3.dev1  pypi_0           pypi
anndata             0.10.3          pypi_0           pypi
array-api-compat    1.4             pypi_0           pypi
bottle              0.12.25         pypi_0           pypi
bzip2               1.0.8           h1de35cc_0
ca-certificates     2023.08.22      hecd8cb5_0
certifi             2023.11.17      pypi_0           pypi
charset-normalizer  3.3.2           pypi_0           pypi
contourpy           1.2.0           pypi_0           pypi
cycler              0.12.1          pypi_0           pypi
filelock            3.13.1          pypi_0           pypi
fonttools           4.45.1          pypi_0           pypi
fsspec              2023.10.0       pypi_0           pypi
h5py                3.10.0          pypi_0           pypi
huggingface-hub     0.19.4          pypi_0           pypi
idna                3.6             pypi_0           pypi
jinja2              3.1.2           pypi_0           pypi
joblib              1.3.2           pypi_0           pypi
kiwisolver          1.4.5           pypi_0           pypi
libffi              3.4.4           hecd8cb5_0
llvmlite            0.41.1          pypi_0           pypi
markupsafe          2.1.3           pypi_0           pypi
matplotlib          3.8.2           pypi_0           pypi
mpmath              1.3.0           pypi_0           pypi
natsort             8.4.0           pypi_0           pypi
ncurses             6.4             hcec6c5f_0
networkx            3.2.1           pypi_0           pypi
numba               0.58.1          pypi_0           pypi
numpy               1.26.2          pypi_0           pypi
openssl             3.0.12          hca72f7f_0
packaging           23.2            pypi_0           pypi
pandas              2.1.3           pypi_0           pypi
patsy               0.5.4           pypi_0           pypi
pillow              10.1.0          pypi_0           pypi
pip                 23.3.1          py311hecd8cb5_0
psutil              5.9.6           pypi_0           pypi
pynndescent         0.5.11          pypi_0           pypi
pyparsing           3.1.1           pypi_0           pypi
python              3.11.5          hf27a42d_0
python-dateutil     2.8.2           pypi_0           pypi
pytz                2023.3.post1    pypi_0           pypi
pyyaml              6.0.1           pypi_0           pypi
readline            8.2             hca72f7f_0
requests            2.31.0          pypi_0           pypi
safetensors         0.4.1           pypi_0           pypi
scanpy              1.9.6           pypi_0           pypi
scikit-learn        1.3.2           pypi_0           pypi
scipy               1.11.4          pypi_0           pypi
seaborn             0.12.2          pypi_0           pypi
session-info        1.0.0           pypi_0           pypi
setproctitle        1.3.3           pypi_0           pypi
setuptools          68.0.0          py311hecd8cb5_0
six                 1.16.0          pypi_0           pypi
sqlite              3.41.2          h6c40b1e_0
statsmodels         0.14.0          pypi_0           pypi
stdlib-list         0.10.0          pypi_0           pypi
sympy               1.12            pypi_0           pypi
threadpoolctl       3.2.0           pypi_0           pypi
tk                  8.6.12          h5d9f67b_0
torch               2.1.1           pypi_0           pypi
tqdm                4.66.1          pypi_0           pypi
typing-extensions   4.8.0           pypi_0           pypi
tzdata              2023.3          pypi_0           pypi
umap-learn          0.5.5           pypi_0           pypi
urllib3             1.26.6          pypi_0           pypi
waitress            2.1.2           pypi_0           pypi
wheel               0.41.2          py311hecd8cb5_0
xz                  5.4.2           h6c40b1e_0
zlib                1.2.13          h4dc903c_0
Can you please post the full terminal output? There should be terminal output before the error. What do the entries in .var_names look like? Are they gene names or Ensembl IDs?
Are you able to run the example anndata?
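A quick way to answer the .var_names question is to count how many entries look like Ensembl gene IDs. This is a rough sketch, not part of UCE: `summarize_gene_names` and the `ENSEMBL_ID` pattern are hypothetical helpers, and the pattern assumes the usual `ENS<species>G` prefix followed by 11 digits and an optional version suffix.

```python
import re

# Ensembl gene IDs look like ENSG00000141510 (human) or ENSGALG00000012345
# (chicken), optionally followed by a version suffix such as ".12".
ENSEMBL_ID = re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")


def summarize_gene_names(names, preview=5):
    """Count Ensembl-like entries and return a short preview of the names."""
    names = [str(n) for n in names]
    n_ensembl = sum(bool(ENSEMBL_ID.match(n)) for n in names)
    return {
        "total": len(names),
        "ensembl_like": n_ensembl,
        "preview": names[:preview],
    }


# Small in-memory example with one gene symbol and two Ensembl-style IDs.
example = summarize_gene_names(["TP53", "ENSG00000141510", "ENSGALG00000012345.1"])
print(example)

# Usage with a real file (requires the anndata package):
#   import anndata as ad
#   adata = ad.read_h5ad("scanpyscenic.h5ad")
#   print(summarize_gene_names(adata.var_names))
```

If `ensembl_like` is close to `total`, the object is indexed by Ensembl IDs rather than gene names.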
Did you also try running python eval_single_anndata.py with no additional arguments? That downloads and runs the default h5ad, which would establish whether it's an environment issue or something else.
@yhr91 running python eval_single_anndata.py worked great: it pulled down 10k_pbmcs_proc, loaded the model ./model_files/4layer_model.torch, and wrote the new AnnData output to ./10k_pbmcs_proc_uce_adata.h5ad.
Running the same script with chicken_heart.h5ad or one of my AnnData objects (.var_names are gene names, not Ensembl IDs) returns "TypeError: cannot unpack non-iterable NoneType object". The full error follows:
(UCE) $ python eval_single_anndata.py --adata_path chicken_heart.h5ad --dir res --species chicken
Using sample 4 layer model
**********************************
***********chicken_heart.h5ad ERROR***********
**********************************
Traceback (most recent call last):
File "~/scratch/UCE/eval_single_anndata.py", line 155, in <module>
main(args, accelerator)
File "~/scratch/UCE/eval_single_anndata.py", line 83, in main
processor.preprocess_anndata()
File "~/scratch/UCE/evaluate.py", line 93, in preprocess_anndata
self.adata, num_cells, num_genes = \
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot unpack non-iterable NoneType object
anndata is 0.10.3 from pypi_0 build
Thanks for pointing this out. We made some updates to the code for running cross-species data.
If you follow the instructions in this notebook, everything should be set up correctly to run the script.
We're working on simplifying the interface a bit more for the cross-species setting.
Thank you for the update, but eval_single_anndata.py still fails with the same "cannot unpack" error described above. The goal was to use this on a human dataset where I've verified that the AnnData works in scanpy and other tools, confirmed that the gene names are not Ensembl IDs, etc. I tried creating a fresh conda environment with the 4 layer model newly pulled down and hit the same error.
Thanks, can you share the exact command you are using to run evaluation on the human dataset?
Could you please try pulling the latest version of the repo and running the default command again? Thanks!
I tried again with a fresh conda env (Python 3.11) and a fresh clone of the git repo, but hit the following error when running the pbmc10k example. Input command: python eval_single_anndata.py
Error output:
Using sample AnnData: 10k pbmcs dataset
Downloading ./data/10k_pbmcs_proc.h5ad from https://figshare.com/ndownloader/files/42706966 ...
100%|█████████████████████████████████████| 85.6M/85.6M [00:05<00:00, 16.8MiB/s]
Using sample 4 layer model
Downloading ./model_files/4layer_model.torch from https://figshare.com/ndownloader/files/42706576 ...
100%|█████████████████████████████████████| 3.40G/3.40G [02:48<00:00, 20.2MiB/s]
Proccessing 10k_pbmcs_proc
8029.0
10k_pbmcs_proc (11990, 10809)
Wrote Shapes Dict
10809
Max Code: 613
Traceback (most recent call last):
File "/Users/no/scratch/UCE/UCE/eval_single_anndata.py", line 155, in <module>
main(args, accelerator)
File "/Users/no/scratch/UCE/UCE/eval_single_anndata.py", line 85, in main
processor.run_evaluation()
File "/Users/no/scratch/UCE/UCE/evaluate.py", line 145, in run_evaluation
run_eval(self.adata, self.name, self.pe_idx_path, self.chroms_path,
File "/Users/no/scratch/UCE/UCE/evaluate.py", line 206, in run_eval
all_pe = get_ESM2_embeddings(args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/no/scratch/UCE/UCE/evaluate.py", line 151, in get_ESM2_embeddings
all_pe = torch.load(args.token_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/no/miniconda3/envs/UCE2/lib/python3.11/site-packages/torch/serialization.py", line 993, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/no/miniconda3/envs/UCE2/lib/python3.11/site-packages/torch/serialization.py", line 447, in __init__
super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
I'm not sure exactly what is causing this. It could be that the file did not download or unzip properly. Maybe try downloading it again?
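Before re-downloading, one way to test the incomplete-download theory is to check whether the file still ends with a valid zip central directory, which is what the "failed finding central directory" message refers to. This is a minimal sketch using only the standard library; `check_torch_file` is a hypothetical helper, and it assumes the token file is the ./model_files/all_tokens.torch from the earlier download log. A file saved in PyTorch's older non-zip format would also report False here, so treat the result as a hint rather than proof.

```python
import os
import zipfile


def check_torch_file(path):
    """Return (size_in_bytes, looks_like_complete_zip) for a downloaded file."""
    size = os.path.getsize(path)
    # is_zipfile looks for the end-of-central-directory record, which is
    # missing when a zip-based checkpoint was cut off mid-download.
    return size, zipfile.is_zipfile(path)


# Usage (path taken from the download log; adjust if yours differs):
#   size, ok = check_torch_file("./model_files/all_tokens.torch")
#   print(f"{size / 1e9:.2f} GB, complete zip: {ok}")
```

The earlier download log reported 2.98G for all_tokens.torch, so a much smaller size or a False result would suggest deleting the file and letting the script fetch it again.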