
error loading state_dict for TerminatorTagger

Hi, thanks for developing this software; it sounds really useful for our research. I am trying to get it up and running and am hitting an error I don't understand. I installed all of the prerequisites in a conda environment, then tried running it on a CPU (it wouldn't recognize our GPU for some reason) with the included example data.

However, I get the following error:

./scripts/batter --fasta ./examples/S.aureus/GCF_000013425.1_ASM1342v1_genomic.fna --output ../staph.batter.out.bed --device cpu -rc -v
[2023-10-30 16:50:48,672] [tagging terminators] Initialize the model ...
[2023-10-30 16:50:48,770] [tagging terminators] Load model paramters from model/ ...
Traceback (most recent call last):
File "pathredacted/batter/batter/./scripts/batter", line 188, in
File "pathredacted/batter/batter/./scripts/batter", line 112, in main
File "pathredacted/opt/envs/batter/lib/python3.10/site-packages/torch/nn/modules/", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TerminatorTagger:
Unexpected key(s) in state_dict: "encoder.embeddings.position_ids".

Any ideas or suggestions on what may be going wrong? Thanks!

Hi Alexandra, if you are using the latest transformers package, I think conda install -c conda-forge transformers==4.18.0 should fix the problem. It seems later Longformer versions implement the "position_ids" variable differently and no longer keep it in the model state dict, so a checkpoint saved with an older version carries a key that the newer model does not expect. Thanks for helping us improve our tool. (Also note that inference on CPU can be slow and may take some time to finish.)
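
For readers who cannot downgrade transformers, one possible alternative is to drop the stale key from the checkpoint before loading it. The following is only a sketch and has not been tested against batter: the helper name drop_unexpected_keys is hypothetical, and apart from "encoder.embeddings.position_ids" (taken from the error message above) the key names and values are made-up stand-ins so the example runs without PyTorch.

```python
from collections import OrderedDict


def drop_unexpected_keys(state_dict, expected_keys):
    """Split a checkpoint into entries the model expects and keys it does not.

    Hypothetical helper: works on any mapping, so it can be tried without PyTorch.
    """
    expected = set(expected_keys)
    kept = OrderedDict((k, v) for k, v in state_dict.items() if k in expected)
    dropped = [k for k in state_dict if k not in expected]
    return kept, dropped


# Stand-in checkpoint. Only the position_ids key comes from the error message;
# the other two names are invented for illustration.
checkpoint = OrderedDict([
    ("encoder.embeddings.position_ids", [0, 1, 2]),
    ("encoder.embeddings.word_embeddings.weight", [[0.1, 0.2]]),
    ("crf.transitions", [[0.0]]),
])
model_keys = [
    "encoder.embeddings.word_embeddings.weight",
    "crf.transitions",
]

filtered, dropped = drop_unexpected_keys(checkpoint, model_keys)
print("dropped:", dropped)
print("kept:", list(filtered))
```

With a real PyTorch model, the expected keys would come from model.state_dict().keys() and the filtered mapping would be passed to model.load_state_dict(...). Passing strict=False to load_state_dict is a shorter route, but it also silences genuinely missing keys, so filtering explicitly and printing what was dropped is the more cautious choice.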

Thanks! I tried that and unfortunately have a different error now:

./scripts/batter --fasta ./examples/S.aureus/GCF_000013425.1_ASM1342v1_genomic.fna --output ../staph.batter.out.bed --device cpu -rc -v
[2023-10-31 13:09:05,559] [tagging terminators] Initialize the model ...
[2023-10-31 13:09:05,683] [tagging terminators] Load model paramters from model/ ...
[2023-10-31 13:09:05,828] [tagging terminators] Will use cpu for inference ...
[2023-10-31 13:09:05,830] [tagging terminators] Load sequences from ./examples/S.aureus/GCF_000013425.1_ASM1342v1_genomic.fna ...
[2023-10-31 13:09:06,121] [tagging terminators] Intermediate result will be saved to ../staph.batter.out.bed.tmp ...
[2023-10-31 13:09:06,131] [tagging terminators] processing NC_007795.1 ...
/thefolderpath/batter/batter/scripts/ UserWarning: where received a uint8 condition tensor. This behavior is deprecated and will be removed in a future version of PyTorch. Use a boolean condition instead. (Triggered internally at /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1696859578619/work/aten/src/ATen/native/TensorCompare.cpp:493.)
score = torch.where(mask[i].unsqueeze(-1).unsqueeze(-1), next_score, score)
Traceback (most recent call last):
File "/thefolderpath//batter/batter/./scripts/batter", line 188, in
File "/thefolderpath/batter/batter/./scripts/batter", line 158, in main
inference(tagger, batched_tokens, batched_ivs, args.top_k, args.device) #, args.temperature)
File "/thefolderpath/batter/batter/./scripts/batter", line 82, in inference
tags, probs = tagging(batched_tokens, tagger, nbest, temperature)
File "/thefolderpath//batter/batter/./scripts/batter", line 23, in tagging
tags, scores = model.crf.decode(logits, attention_mask, nbest=nbest)
File "/thefolderpath/batter/batter/scripts/", line 124, in decode
return self._viterbi_decode_nbest(emissions, mask, nbest, pad_tag)
File "/thefolderpath//batter/batter/scripts/", line 362, in _viterbi_decode_nbest
next_score = next_score.view(batch_size, -1, self.num_tags)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

I installed the dependencies with:
mamba install -n batter transformers==4.18.0 ushuffle pytorch cudatoolkit pyfaidx==0.7.1 numpy lightgbm bedtools pandas

and the environment has these versions:

I figured it out. Once I pinned pytorch to 1.7.1, it seems to work:

mamba install -n batter transformers==4.18.0 ushuffle pytorch==1.7.1 cudatoolkit pyfaidx==0.7.1 numpy lightgbm bedtools pandas
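
Since both errors in this thread came down to package versions, a quick check of the active environment against the pins that worked here may save time. This is a minimal sketch using only the standard library; the function name check_pins and the PINS table are assumptions for illustration, not part of batter. It also assumes the conda pytorch package registers itself under the Python distribution name "torch".

```python
from importlib import metadata

# Versions reported to work in this thread (pytorch is distributed as "torch").
PINS = {"torch": "1.7.1", "transformers": "4.18.0", "pyfaidx": "0.7.1"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that misses its pin.

    found is None when the package is not installed. A local build suffix such
    as "+cu110" is ignored when comparing versions.
    """
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found is None or found.split("+")[0] != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    mismatches = check_pins(PINS)
    for name, (wanted, found) in mismatches.items():
        print(f"{name}: wanted {wanted}, found {found}")
    if not mismatches:
        print("all pinned packages match")
```

Run inside the batter environment, this lists any package that differs from the pins above. The get_version parameter exists only so the function can be exercised without installing anything.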

