Using ASR to obtain syllables, matching text from lyrics, and generating JSON for Minlabel preloading.
Note: AP is labeled only inside the existing "SP" intervals of the original tg (TextGrid) file, so accuracy depends on the quality of the original annotation file.
-
If you use SOFA to generate the TextGrid annotations:
python infer.py ... --ap_detector NoneAPDetector
Add "--ap_detector NoneAPDetector" to the SOFA command so that it generates a tg file without AP annotations.
-
Download the model. The model folder should contain:
model_folder
├── config.yaml
└── model_ckpt_steps_7000.ckpt
-
Generate AP labels by running textgrid-add-ap
python textgrid-add-ap.py --ckpt_path model_folder/xx.ckpt --wav_dir wav_dir --tg_dir tg_dir --tg_out_dir tg_out_dir
Options:
    --ckpt_path     str    Path to the checkpoint.
    --wav_dir       str    Folder of wav files (*.wav).
    --tg_dir        str    Folder of TextGrid files (*.TextGrid).
    --tg_out_dir    str    Output folder for the tg files after AP labeling.
    --ap_threshold  float  Breath probability recognition threshold. Optional, default: 0.4.
    --ap_dur        float  Minimum breath duration, in seconds; shorter AP segments are discarded. Optional, default: 0.08.
    --sp_dur        float  SP fragments shorter than this, in seconds, are absorbed into the adjacent AP. Optional, default: 0.1.
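
To illustrate how the three optional thresholds could interact, here is a minimal sketch. It is not the code of textgrid-add-ap.py: the function name `label_ap_in_sp`, the per-frame probability input, and the `frame_sec` parameter are all assumptions made for illustration. It assumes one breath probability per fixed-length frame inside a single SP interval.

```python
from typing import List, Tuple

Interval = Tuple[float, float, str]  # (start_sec, end_sec, label)


def _merge(segs: List[Interval]) -> List[Interval]:
    """Merge neighbouring intervals that carry the same label."""
    out: List[Interval] = []
    for start, end, label in segs:
        if out and out[-1][2] == label:
            out[-1] = (out[-1][0], end, label)
        else:
            out.append((start, end, label))
    return out


def label_ap_in_sp(
    probs: List[float],      # hypothetical per-frame breath probabilities
    frame_sec: float,        # hypothetical frame length in seconds
    sp_start: float,
    sp_end: float,
    ap_threshold: float = 0.4,
    ap_dur: float = 0.08,
    sp_dur: float = 0.1,
) -> List[Interval]:
    """Split one SP interval into SP/AP pieces (illustrative only)."""
    # 1. Frames at or above ap_threshold become AP, the rest stay SP.
    segs: List[Interval] = []
    for i, p in enumerate(probs):
        label = "AP" if p >= ap_threshold else "SP"
        start = sp_start + i * frame_sec
        end = min(sp_start + (i + 1) * frame_sec, sp_end)
        segs.append((start, end, label))
    segs = _merge(segs)

    # 2. AP segments shorter than ap_dur are discarded (turned back into SP).
    segs = _merge([
        (s, e, "SP" if lab == "AP" and e - s < ap_dur else lab)
        for s, e, lab in segs
    ])

    # 3. SP fragments shorter than sp_dur that touch an AP are absorbed into it.
    absorbed: List[Interval] = []
    for i, (s, e, lab) in enumerate(segs):
        near_ap = (i > 0 and segs[i - 1][2] == "AP") or (
            i + 1 < len(segs) and segs[i + 1][2] == "AP"
        )
        if lab == "SP" and e - s < sp_dur and near_ap:
            lab = "AP"
        absorbed.append((s, e, lab))
    return _merge(absorbed)
```

In this sketch, raising `--ap_threshold` yields fewer AP frames, raising `--ap_dur` drops more short breaths, and raising `--sp_dur` lets AP intervals swallow longer silent gaps. The real script may order or implement these steps differently.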