RuntimeError: Class values must be smaller than num_classes. | protein_mpnn_utils.py & mask_size issue?
emilyrkang opened this issue · 0 comments
I'm getting the following RuntimeError when trying to run ProteinMPNN on a Windows machine with Python 3.7. My approach works with the example 6 inputs, but when I use my own protein structure, 4rjj, I get: RuntimeError: Class values must be smaller than num_classes. I've tried using the biological assembly downloaded directly from the PDB, and I've tried removing the ligands and all non-"ATOM" lines from the structure, but I still get this error.
The following command works, but I would like to model it as a homooligomer and fix some residues.
py protein_mpnn_run.py --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --pdb_path 4rjj.pdb --pdb_path_chains "A B C D" --out_folder "C:\ProteinMPNN\myoutputs\4rjj" --num_seq_per_target 20 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'
Also, when I use the helper scripts, I need to remove the path ("C:\ProteinMPNN\my_input_PDBS\" in the text below) from the jsonl files or I get the following error message: OSError: [Errno 22] Invalid argument: 'my_outputs_directory//seqs/C:\ProteinMPNN\my_input_PDBS\4rjj.fa'
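A possible workaround for the path problem, as a minimal sketch: strip the directory part from the structure names in the jsonl files before running protein_mpnn_run.py. It assumes each line of parsed_pdbs.jsonl is a JSON object with a "name" field, and that files such as tied_pdbs.jsonl are JSON objects keyed by that same name. The function names here are hypothetical and not part of ProteinMPNN.

```python
import json
import ntpath  # understands both "\" and "/" separators on any OS


def strip_directory(name):
    """Return only the final component of a Windows- or POSIX-style path."""
    return ntpath.basename(name)


def fix_parsed_line(line):
    """Rewrite the "name" field of one parsed_pdbs.jsonl line (assumed layout)."""
    record = json.loads(line)
    if "name" in record:
        record["name"] = strip_directory(record["name"])
    return json.dumps(record)


def fix_keyed_line(line):
    """Rewrite the top-level keys of a name-keyed line, e.g. tied_pdbs.jsonl."""
    mapping = json.loads(line)
    return json.dumps({strip_directory(key): value for key, value in mapping.items()})
```

Applied to every line of each jsonl file, this should turn a name like C:\ProteinMPNN\my_input_PDBS\4rjj into 4rjj, so the output file becomes my_outputs_directory//seqs/4rjj.fa.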
Here is my complete output:
C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\outputs\example_6_outputs\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\outputs\example_6_outputs\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'
chain_id_jsonl is NOT loaded
fixed_positions_jsonl is NOT loaded
pssm_jsonl is NOT loaded
omit_AA_jsonl is NOT loaded
bias_AA_jsonl is NOT loaded
bias by residue dictionary is not loaded, or not provided
discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}
Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 6EHB
12 sequences of length 960 generated in 75.8997 seconds
Generating sequences for: 4GYT
12 sequences of length 354 generated in 52.1981 seconds
C:\ProteinMPNN>py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'
chain_id_jsonl is NOT loaded
fixed_positions_jsonl is NOT loaded
pssm_jsonl is NOT loaded
omit_AA_jsonl is NOT loaded
bias_AA_jsonl is NOT loaded
bias by residue dictionary is not loaded, or not provided
discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}
Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 4rjj
Traceback (most recent call last):
File "protein_mpnn_run.py", line 469, in <module>
main(args)
File "protein_mpnn_run.py", line 331, in main
sample_dict = model.tied_sample(X, randn_2, S, chain_M, chain_encoding_all, residue_idx, mask=mask, temperature=temp, omit_AAs_np=omit_AAs_np, bias_AAs_np=bias_AAs_np, chain_M_pos=chain_M_pos, omit_AA_mask=omit_AA_mask, pssm_coef=pssm_coef, pssm_bias=pssm_bias, pssm_multi=args.pssm_multi, pssm_log_odds_flag=bool(args.pssm_log_odds_flag), pssm_log_odds_mask=pssm_log_odds_mask, pssm_bias_flag=bool(args.pssm_bias_flag), tied_pos=tied_pos_list_of_lists_list[0], tied_beta=tied_beta, bias_by_res=bias_by_res_all)
File "C:\ProteinMPNN\protein_mpnn_utils.py", line 1218, in tied_sample
permutation_matrix_reverse = torch.nn.functional.one_hot(decoding_order, num_classes=mask_size).float()
RuntimeError: Class values must be smaller than num_classes.
C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'
chain_id_jsonl is NOT loaded
fixed_positions_jsonl is NOT loaded
pssm_jsonl is NOT loaded
omit_AA_jsonl is NOT loaded
bias_AA_jsonl is NOT loaded
bias by residue dictionary is not loaded, or not provided
discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}
Number of edges: 48
Training noise level: 0.2A
Generating sequences for: C:\ProteinMPNN\my_input_PDBS\4rjj
Traceback (most recent call last):
File "protein_mpnn_run.py", line 469, in <module>
main(args)
File "protein_mpnn_run.py", line 323, in main
with open(ali_file, 'w') as f:
OSError: [Errno 22] Invalid argument: 'my_outputs_directory//seqs/C:\ProteinMPNN\my_input_PDBS\4rjj.fa'
C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'
chain_id_jsonl is NOT loaded
fixed_positions_jsonl is NOT loaded
pssm_jsonl is NOT loaded
omit_AA_jsonl is NOT loaded
bias_AA_jsonl is NOT loaded
bias by residue dictionary is not loaded, or not provided
discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}
Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 4rjj
Traceback (most recent call last):
File "protein_mpnn_run.py", line 469, in <module>
main(args)
File "protein_mpnn_run.py", line 331, in main
sample_dict = model.tied_sample(X, randn_2, S, chain_M, chain_encoding_all, residue_idx, mask=mask, temperature=temp, omit_AAs_np=omit_AAs_np, bias_AAs_np=bias_AAs_np, chain_M_pos=chain_M_pos, omit_AA_mask=omit_AA_mask, pssm_coef=pssm_coef, pssm_bias=pssm_bias, pssm_multi=args.pssm_multi, pssm_log_odds_flag=bool(args.pssm_log_odds_flag), pssm_log_odds_mask=pssm_log_odds_mask, pssm_bias_flag=bool(args.pssm_bias_flag), tied_pos=tied_pos_list_of_lists_list[0], tied_beta=tied_beta, bias_by_res=bias_by_res_all)
File "C:\ProteinMPNN\protein_mpnn_utils.py", line 1218, in tied_sample
permutation_matrix_reverse = torch.nn.functional.one_hot(decoding_order, num_classes=mask_size).float()
RuntimeError: Class values must be smaller than num_classes.
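For context, torch.nn.functional.one_hot raises this error whenever an index in its input is greater than or equal to num_classes. In tied_sample, mask_size appears to be the number of residues in the structure, so one unconfirmed hypothesis is that tied_pdbs.jsonl refers to positions that do not exist in every chain, for example when the chains of the assembly have different numbers of resolved residues. The sketch below checks for that. It assumes parsed_pdbs.jsonl stores each chain's sequence under a key like "seq_chain_A", and that the tied entry for a structure is a list of groups of the form {"A": [1], "B": [1]} with 1-based positions. Both layouts and both function names are assumptions, not confirmed against the ProteinMPNN source.

```python
def chain_lengths(record):
    """Map chain ID to sequence length for one parsed_pdbs.jsonl record.

    Assumes sequences are stored under keys named "seq_chain_<ID>".
    """
    prefix = "seq_chain_"
    return {
        key[len(prefix):]: len(value)
        for key, value in record.items()
        if key.startswith(prefix)
    }


def out_of_range_tied_positions(record, tied_groups):
    """List (chain, position) pairs that fall outside the chain's length.

    Assumes each tied group maps chain ID to a list of 1-based positions.
    """
    lengths = chain_lengths(record)
    problems = []
    for group in tied_groups:
        for chain, positions in group.items():
            for pos in positions:
                if pos < 1 or pos > lengths.get(chain, 0):
                    problems.append((chain, pos))
    return problems
```

If chain_lengths reports different lengths for chains A to D of 4rjj, or out_of_range_tied_positions returns a non-empty list, that would be consistent with the hypothesis and would point at the tied positions file rather than the model weights.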