facebookresearch/fairseq

Decoding in HuBERT: RecursionError: maximum recursion depth exceeded in comparison

Charlottehoo opened this issue

❓ Questions and Help

I am working with the HuBERT model.
I pretrained a new model on a YouTube dataset and fine-tuned it on some animal sound files. The label for each file is an animal name, so I created a new dictionary plus .ltr and .wrd label files. I set the .ltr and .wrd targets to be the same: the animal-name labels.
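
A minimal sketch of how such label files can be written (the file names and labels here are hypothetical; fairseq's .ltr format separates characters with spaces and ends each word with "|"):

# Hypothetical example: write matching .wrd / .ltr targets for
# one animal-name label per audio file, in manifest (.tsv) order.
labels = ["dog", "cat", "cow"]  # assumed labels, one per utterance

with open("train.wrd", "w") as wrd, open("train.ltr", "w") as ltr:
    for label in labels:
        wrd.write(label + "\n")              # word-level target
        ltr.write(" ".join(label) + " |\n")  # letter-level target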

I ran 50 iterations of pretraining and 50 iterations of fine-tuning, but when I decode the test set using checkpoint_best.pt, I run into the following error.

What is your question?

File "/export/home/chu/fairseq/fairseq/models/hubert/hubert_asr.py", line 374, in init model = pretrain_task.build_model(w2v_args.model, from_checkpoint=True)
File "/export/home/chu/fairseq/fairseq/tasks/fairseq_task.py", line 355, in build_model model = models.build_model(cfg, self, from_checkpoint)
File "/export/home/chu/fairseq/fairseq/models/init.py", line 106, in build_model return model.build_model(cfg, task)
File "/export/home/chu/fairseq/fairseq/models/hubert/hubert_asr.py", line 170, in build_model w2v_encoder = HubertEncoder(cfg, task)
File "/export/home/chu/fairseq/fairseq/models/hubert/hubert_asr.py", line 349, in init state = checkpoint_utils.load_checkpoint_to_cpu(cfg.w2v_path, arg_overrides)
File "/export/home/chu/fairseq/fairseq/checkpoint_utils.py", line 358, in load_checkpoint_to_cpu state["cfg"] = OmegaConf.create(state["cfg"])
RecursionError: maximum recursion depth exceeded in comparison full_key: job_logging_cfg.formatters.simple.format
reference_type=Any object_type=dict
full_key: job_logging_cfg.formatters.simple reference_type=Any
object_type=dict full_key: job_logging_cfg.formatters
reference_type=Any object_type=dict
full_key: job_logging_cfg reference_type=Optional[Dict[Union[str, Enum], Any]]
object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
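
For anyone trying to work around this: the failure happens while OmegaConf re-creates the config stored inside the checkpoint, and the error points at the Hydra job_logging_cfg section. A hypothetical, untested workaround sketch that strips that section from a copy of the checkpoint before decoding:

# Hypothetical workaround sketch: remove the stored Hydra logging config
# (the key named in the error) and save a stripped copy of the checkpoint.
# The path is a placeholder; whether this is safe for a given checkpoint is untested.
import torch

state = torch.load("checkpoint_best.pt", map_location="cpu")
state["cfg"].pop("job_logging_cfg", None)  # drop the section OmegaConf recurses on
torch.save(state, "checkpoint_best.stripped.pt")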

Code

python fairseq_cli/hydra_train.py \
  --config-dir ~/fairseq/examples/hubert/config/finetune \
  --config-name base_10h \
  task.data=/.../data/finetune_data \
  task.label_dir=/.../trans/finetune_trans \
  model.w2v_path=/.../fairseq/None/checkpoints/checkpoint_best.pt \
  dataset.skip_invalid_size_inputs_valid_test=true \
  checkpoint.save_interval=1 \
  checkpoint.reset_optimizer=true

python examples/speech_recognition/new/infer.py \
  --config-dir ~/.../config/decode \
  --config-name infer_viterbi \
  task.data=/.../data \
  task.normalize=false \
  dataset.gen_subset=test \
  common_eval.path=/.../checkpoints/checkpoint_best.pt
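
A quick way to check that the problem lives in the stored config rather than in the decoding script is to repeat the failing call from the traceback in isolation (a sketch; the checkpoint path is a placeholder):

# Hypothetical repro sketch: load only the checkpoint's config and
# hand it to OmegaConf, which is the call that fails in the traceback.
import torch
from omegaconf import OmegaConf

state = torch.load("checkpoint_best.pt", map_location="cpu")
cfg = OmegaConf.create(state["cfg"])  # raises RecursionError if the stored cfg is cyclic
print(OmegaConf.to_yaml(cfg)[:400])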

Config file

defaults:
  - model: null

hydra:
  run:
    dir: ${common_eval.results_path}/viterbi
  sweep:
    dir: ${common_eval.results_path}
    subdir: viterbi

task:
  _name: hubert_pretraining
  single_target: true
  fine_tuning: true
  data: ???
  normalize: ???

decoding:
  type: viterbi
  unique_wer_file: true
common_eval:
  results_path: ???
  path: ???
  post_process: letter
dataset:
  max_tokens: 1100000
  gen_subset: ???
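
For context, the ??? values are OmegaConf's mandatory-missing markers; each must be supplied as a command-line override (as in the decode command above) before it can be read. A minimal illustration of the semantics:

# Minimal illustration of OmegaConf's mandatory-missing marker "???".
from omegaconf import OmegaConf
from omegaconf.errors import MissingMandatoryValue

cfg = OmegaConf.create({"task": {"data": "???"}})
try:
    _ = cfg.task.data  # raises until a value is provided
except MissingMandatoryValue:
    cfg.task.data = "/path/to/data"
print(cfg.task.data)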

What's your environment?

  • fairseq version: 0.12.2
  • PyTorch version: 2.2.2
  • OS: Linux
  • How you installed fairseq: pip
  • Build command you used: pip install --no-build-isolation --editable ./
  • Python version: 3.9.19