large-v2 for English: text missing from speech-to-text output

Question

machenme opened this issue 9 months ago · 1 comment

When I use large-v2 for speech-to-text, some of the speech is missing from the output.
Here is my data:
https://youtu.be/zYC7tKfKPtM?si=Rm2ZK9Rez5E9CSlP
At 00:01:26, the entire sentence "On the right-hand side, you put any expression that you want." is missing.
This is the distil-whisper-large-v2 output:

23
00:01:20,750 --> 00:01:23,770
An assignment statement has a name on the left-hand side.

24
00:01:23,890 --> 00:01:25,430
It can be any name that you invent.

25
00:01:30,240 --> 00:01:35,780
will evaluate that expression and bind it to the name. So now radius is bound

26
00:01:35,780 --> 00:01:44,250
to the value 10. Two times radius is 20. I can use that name when I bind other

Here is the faster-whisper-large-v2 output:

23
00:01:20,710 --> 00:01:23,770
An assignment statement has a name on the left-hand side.

24
00:01:23,930 --> 00:01:25,430
It could be any name that you invent.

25
00:01:26,200 --> 00:01:29,340
On the right-hand side, you put any expression that you want.

26
00:01:29,720 --> 00:01:31,720
And Python will evaluate that expression

27
00:01:31,720 --> 00:01:33,600
and bind it to the name.

28
00:01:34,300 --> 00:01:36,660
So now radius is bound to the value 10.
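
A dropped passage like the one at 00:01:26 shows up as an unusually long pause between two consecutive cues, so it can be found without reading both files side by side. The following is a minimal sketch, assuming standard SRT timing lines; the function names and the 2-second threshold are illustrative choices, not part of faster-whisper or distil-whisper.

```python
import re

# Matches an SRT timing line such as "00:01:23,890 --> 00:01:25,430".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def parse_cues(srt_text):
    """Return a list of (start_ms, end_ms) pairs, one per timing line."""
    cues = []
    for match in TIMING.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
        end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
        cues.append((start, end))
    return cues


def find_gaps(cues, min_gap_ms=2000):
    """Return (gap_start_ms, gap_end_ms) for pauses of at least min_gap_ms."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(cues, cues[1:]):
        if next_start - prev_end >= min_gap_ms:
            gaps.append((prev_end, next_start))
    return gaps


def fmt(ms):
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    seconds, ms = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"


# Timing lines copied from the distil-whisper-large-v2 excerpt in this issue.
distil_srt = """\
00:01:20,750 --> 00:01:23,770
00:01:23,890 --> 00:01:25,430
00:01:30,240 --> 00:01:35,780
00:01:35,780 --> 00:01:44,250
"""

for gap_start, gap_end in find_gaps(parse_cues(distil_srt)):
    print(f"possible dropped speech: {fmt(gap_start)} -> {fmt(gap_end)}")
```

On the distil-whisper excerpt this flags the interval from 00:01:25,430 to 00:01:30,240, which is where the missing sentence belongs. A long pause can also be genuine silence, so each flagged interval still needs to be checked against the audio.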

Answer 1 · 2024-03-28T17:25:20.000Z

Could you try distil-large-v3? It includes training improvements that should make it perform better in Faster-Whisper: https://huggingface.co/distil-whisper/distil-large-v3#faster-whisper
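
For reference, here is a rough sketch of what that retry could look like, following the usage pattern on the linked model card. It assumes a recent faster-whisper release that recognizes the `distil-large-v3` model name and a CUDA GPU; the helper names (`format_timestamp`, `segments_to_srt`, `transcribe_to_srt`) and the audio path in the usage note are hypothetical, introduced only for illustration.

```python
def format_timestamp(seconds):
    """Convert seconds (float) to an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render objects with .start, .end and .text attributes as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(blocks)


def transcribe_to_srt(audio_path, model_size="distil-large-v3",
                      device="cuda", compute_type="float16"):
    """Transcribe an English audio file and return the result as SRT text."""
    # Imported here so the formatting helpers work without faster-whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device, compute_type=compute_type)
    # Settings taken from the model card: English only, and no conditioning
    # on previously decoded text.
    segments, _info = model.transcribe(
        audio_path,
        beam_size=5,
        language="en",
        condition_on_previous_text=False,
    )
    return segments_to_srt(segments)
```

Calling `print(transcribe_to_srt("lecture.mp3"))` would then produce subtitles that can be compared against the two excerpts above. On a machine without a GPU, `device="cpu"` with `compute_type="int8"` is a common alternative, at the cost of speed.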