Issues
- 1
When running under espnet/egs2/aishell/asr1/, we get the error TypeError: wav2vec2_custom() missing 1 required positional argument: 'ckpt'. How can this be fixed? Thank you very much!
#46 opened by MELABIPCAS - 4
How to obtain discrete IDs for the 1024-dimensional features
#47 opened by wcr369 - 1
How to convert the pretrained weights to Hugging Face format?
#54 opened by CodeMrSheep - 0
What is the sampling rate?
#53 opened by sunjian2015 - 2
- 1
How to adjust the audio fps to 25
#52 opened by tailangjun - 0
- 1
BibTeX citation for this project
#50 opened by mixxs - 1
How to extract audio features
#49 opened by tailangjun - 3
Error
#48 opened by ChengsongLu - 1
fairseq and Hugging Face outputs differ
#45 opened by hao-qiang - 0
.
#44 opened by Bingtai1015 - 2
Can features be extracted from audio with a sampling rate of 22050?
#43 opened by Bingtai1015 - 11
- 1
What is the sampling rate of the speech used for these pretrained models?
#40 opened by ywh-my - 4
Direct fine-tuning with CTC gives very poor results
#39 opened by zyh3826 - 2
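When extracting speech features with the pretrained model, how should long audio be handled: direct segmentation or a sliding window?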
#23 opened by Owen1234560 - 1
Add WavLM
#42 opened by Blakey-Gavin - 0
Loading the k-means parameters
#41 opened by jidanhuang - 18
Will the ESPnet training code still be uploaded?
#8 opened by qixing-ai - 1
Can this be used for speaker-diarization tasks?
#38 opened by luomingjun2023 - 1
Is it possible to use the pretrained model while also changing its parameters?
#37 opened by LwLiu-2012 - 1
Can features be extracted for both Chinese and English speech at the same time?
#36 opened by milely - 0
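For the HuBERT features, which layer's features are used? Or is it a weighted sum of features from several layers, and if so, with what weights?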
#33 opened by yangsuxia - 0
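Hello, what percentage of the features was used when clustering the large model's features? If it was 10%, how much memory does the machine need?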
#35 opened by manmushanhe - 0
How to obtain the final units?
#34 opened by mikesun4096 - 1
How to fine-tune using the Hugging Face code
#28 opened by Yonnie1331 - 1
Requesting a code example that outputs the final text
#31 opened by moresun - 2
- 12
- 0
Problem about time shape
#30 opened by huutuongtu - 0
What batch_size was used when training the HuBERT model?
#29 opened by dancinghui - 5
Is the final output 768-dimensional or 1024-dimensional?
#26 opened by ZiqiaoPeng - 3
Which field's values can be used as features?
#12 opened by kejom-ou - 0
What is the maximum length of speech that can be processed?
#27 opened by ddlBoJack - 2
Once the model is pretrained, how exactly is the weighted sum computed when extracting audio features?
#19 opened by zdaaaaa - 1
- 1
How to train on WenetSpeech with fairseq
#25 opened by panpan-wu - 3
The k-means model corresponding to the HuBERT model
#7 opened by ziyichen-paii - 0
- 15
Question about ASR fine-tuning convergence speed
#11 opened by qinyuenlp - 0
Fine-tuning with my own dataset, WER is 1
#22 opened by abcdbosh - 0
Hello, how should I do fine-tuning?
#21 opened by SinLT - 0
Hello, is there a Chinese pretrained WavLM model?
#20 opened by dengcunqin - 0
Can we expect a vq-wav2vec self-supervised backbone?
#17 opened by splinter21 - 0
Setting of the pretraining hyperparameter mask_prob
#16 opened by 212wzt5A - 1
Is there any trick to selecting the 100 hours of WenetSpeech data used for training, or is any selection fine?
#14 opened by user-ZJ - 3
- 4
Comparison with the original pretrained models
#9 opened by zhangxueyangjuxie - 3
About fairseq checkpoint link
#6 opened by godiclee