snap-stanford/UCE

File potentially missing

y-doctor opened this issue · 3 comments

Traceback (most recent call last):
Wrote Shapes Dict
Traceback (most recent call last):
File "/tscc/nfs/home/ydoctor/datasets/UCE/eval_single_anndata.py", line 155, in <module>
main(args, accelerator)
File "/tscc/nfs/home/ydoctor/datasets/UCE/eval_single_anndata.py", line 84, in main
processor.generate_idxs()
File "/tscc/nfs/home/ydoctor/datasets/UCE/evaluate.py", line 122, in generate_idxs
species_to_pe = get_species_to_pe(self.args.protein_embeddings_dir)
File "/tscc/nfs/home/ydoctor/datasets/UCE/data_proc/data_utils.py", line 263, in get_species_to_pe
species_to_pe = {
File "/tscc/nfs/home/ydoctor/datasets/UCE/data_proc/data_utils.py", line 264, in <dictcomp>
species:torch.load(pe_dir) for species, pe_dir in embeddings_paths.items()
File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 771, in load
with _open_file_like(f, 'rb') as opened_file:
File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 270, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/opt/conda/lib/python3.10/site-packages/torch/serialization.py", line 251, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/dfs/project/cross-species/yanay/code/uce_code/UCE_public/model_files/protein_embeddings/Gallus_gallus.bGalGal1.mat.broiler.GRCg7b.pep.all.gene_symbol_to_embedding_ESM2.pt'

Also, here is the command I was running:
singularity exec --nv /tscc/nfs/home/ydoctor/containers/data_science_box.sif python /tscc/nfs/home/ydoctor/datasets/UCE/eval_single_anndata.py --adata_path /tscc/nfs/home/ydoctor/datasets/CM4AI/Perturb-seq_10_26_23_RNA_Only.h5ad --dir /tscc/nfs/home/ydoctor/datasets/CM4AI/UCE_Outputs/ --species human --model_loc /tscc/nfs/home/ydoctor/datasets/UCE/model_weights/UCE_params_33_layer.torch --batch_size 8 --nlayers 33

If you could point me towards what I might be able to do to fix this, that would be great. Thanks so much!

I resolved it by commenting out these lines
[Screenshot of the commented-out lines: Screen Shot 2023-12-21 at 12 01 39 PM]

I think what happened is that the latest merge was testing support for a new species (chicken), but the ESM embeddings for it are only saved locally (the path in the error points to /dfs/project/cross-species/...), so loading breaks for anyone else. Maybe consider just commenting these lines out, along with the 'extra_species' lines above them.
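As an alternative to commenting the lines out, the load could skip any species whose embedding file is not on disk. This is only a rough sketch, not the repository's code: `load_existing_embeddings` and its `loader` parameter are hypothetical names, and the only thing taken from the traceback is that `get_species_to_pe` calls `torch.load` on each value of an `embeddings_paths` dict.

```python
from pathlib import Path
from typing import Any, Callable, Dict


def load_existing_embeddings(
    embeddings_paths: Dict[str, str],
    loader: Callable[[str], Any],
) -> Dict[str, Any]:
    """Load embeddings only for species whose file exists (hypothetical helper)."""
    species_to_pe = {}
    for species, pe_path in embeddings_paths.items():
        if not Path(pe_path).is_file():
            # Warn and skip instead of raising FileNotFoundError.
            print(f"Skipping {species}: no embedding file at {pe_path}")
            continue
        species_to_pe[species] = loader(pe_path)
    return species_to_pe
```

Inside `get_species_to_pe` this would presumably be called as `load_existing_embeddings(embeddings_paths, torch.load)`. Skipping silently could hide a genuinely missing species, which is why the sketch prints a warning for each one it drops.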

Yanay1 commented

I think you can also just remove the line in the CSV file for protein embeddings. I've removed it on the main branch now. Thanks for pointing this out!
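For anyone on an older checkout, a quick way to see which CSV rows point at files that are not on your machine might look like the sketch below. The column name `path` is an assumption (the thread does not show the CSV layout), and `find_missing_embedding_rows` is a hypothetical helper, not part of UCE.

```python
import csv
from pathlib import Path
from typing import List, Tuple


def find_missing_embedding_rows(
    csv_path: str, path_column: str = "path"
) -> List[Tuple[int, str]]:
    """Return (line number, path) for rows whose embedding file is missing."""
    missing = []
    with open(csv_path, newline="") as handle:
        # The header is line 1, so data rows start at line 2.
        for line_number, row in enumerate(csv.DictReader(handle), start=2):
            pe_path = row.get(path_column) or ""
            if not pe_path or not Path(pe_path).is_file():
                missing.append((line_number, pe_path))
    return missing
```

Any rows it reports are the ones to remove (or re-point) before running `eval_single_anndata.py`.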