How to load the base model when fine-tuning for KWS

Question

Closed this issue 4 months ago · 3 comments

I want to fine-tune a KWS model on a custom English corpus, starting from a pre-trained model.

Source of the pre-trained model:

git clone https://huggingface.co/yfyeung/icefall-asr-gigaspeech-zipformer-2023-10-17

Recipe:

egs/gigaspeech/KWS

Command I ran:

./run.sh   stage=3

It fails with this error:

Traceback (most recent call last):
 File "/workspace/icefall/icefall/egs/gigaspeech/KWS/./zipformer/finetune.py", line 644, in <module>
   main()
 File "/workspace/icefall/icefall/egs/gigaspeech/KWS/./zipformer/finetune.py", line 638, in main
   run(rank=0, world_size=1, args=args)
 File "/workspace/icefall/icefall/egs/gigaspeech/KWS/./zipformer/finetune.py", line 464, in run
   sp.load(params.bpe_model)
 File "/opt/conda/lib/python3.10/site-packages/sentencepiece/__init__.py", line 961, in Load
   return self.LoadFromFile(model_file)
 File "/opt/conda/lib/python3.10/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
   return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: could not parse ModelProto from icefall-asr-gigaspeech-zipformer-2023-10-17/data/lang_bpe_500/bpe.model

I couldn't find where the mistake is. Can you help me? @pkufool @JinZr

Answer 1 · 2024-06-12T13:47:58.000Z

Does this file exist? icefall-asr-gigaspeech-zipformer-2023-10-17/data/lang_bpe_500/bpe.model

Answer 2 · 2024-06-12T13:47:59.000Z

Could you post the file size of the bpe.model?

Answer 3 · 2024-06-12T14:01:22.000Z

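Fixed in our WeChat group. The fix suggests that the cloned bpe.model was a Git LFS pointer file rather than the actual model, which would explain why SentencePiece could not parse it. Fetching the real file resolved it: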

cd icefall-asr-gigaspeech-zipformer-2023-10-17/data/lang_bpe_500

git lfs pull --include "bpe.model"
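
To tell whether a model file in a fresh clone is still an unfetched Git LFS pointer, a small check like the one below may help. It is only a sketch: the helper names `is_lfs_pointer` and `check_model_file` are made up for illustration and are not part of icefall. The only thing it relies on is that Git LFS pointer files are small text files beginning with `version https://git-lfs.github.com/spec/v1`. A result of `present` means the file is not a pointer; it does not prove the file is a valid SentencePiece model.

```python
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS pointer spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def check_model_file(path):
    """Return 'missing', 'lfs-pointer', or 'present' for the given path.

    Hypothetical helper for illustration; 'present' only means the file
    exists and is not an LFS pointer.
    """
    p = Path(path)
    if not p.is_file():
        return "missing"
    if is_lfs_pointer(p):
        return "lfs-pointer"
    return "present"


if __name__ == "__main__":
    # Path taken from the traceback above; adjust it to your checkout.
    bpe = "icefall-asr-gigaspeech-zipformer-2023-10-17/data/lang_bpe_500/bpe.model"
    status = check_model_file(bpe)
    size = Path(bpe).stat().st_size if status != "missing" else 0
    print(f"{bpe}: {status} ({size} bytes)")
```

If the script reports `lfs-pointer`, the file size will typically be only a few hundred bytes or less, and running the `git lfs pull` command above should replace it with the real model.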