Bug: Transcribing media ends with exclamation marks
What happened?
Transcribing a 1-hour multi-speaker file produces the following output:
00:00 --> 01:20
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:20 --> 01:28
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:28 --> 01:39
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:40 --> 01:41
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:43 --> 01:44
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:44 --> 01:54
Speaker 1:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
01:54 --> 01:57
Steps to reproduce
- Load a file longer than 1 hour into the app
- Set the number of speakers to 8 and the language to German
- Start transcription
I use an AMD 7700XT; maybe that's the reason.
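To check how much of an exported transcript shows the symptom above, something like the following could help. It is only a minimal sketch, not part of vibe: it assumes the transcript is plain text in the layout shown above (a timestamp line, a "Speaker N:" line, then the text), and the function name `degenerate_ratio` is made up for illustration.

```python
import re

# Assumed layout (taken from the output above, not from vibe's code):
#   "MM:SS --> MM:SS" line, then "Speaker N:" line, then the text line.
TIMESTAMP = re.compile(r"^\d{2}:\d{2}(?::\d{2})? --> \d{2}:\d{2}(?::\d{2})?$")
SPEAKER = re.compile(r"^Speaker \d+:$")


def degenerate_ratio(transcript: str) -> float:
    """Return the fraction of text lines made up only of '!' characters."""
    text_lines = []
    for raw in transcript.splitlines():
        line = raw.strip()
        # Skip blank lines, timestamp lines, and speaker labels.
        if not line or TIMESTAMP.match(line) or SPEAKER.match(line):
            continue
        text_lines.append(line)
    if not text_lines:
        return 0.0
    bad = sum(1 for line in text_lines if set(line) == {"!"})
    return bad / len(text_lines)
```

A ratio close to 1.0 would suggest the whole transcript is affected, while a lower value would suggest only some segments are.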
What OS are you seeing the problem on?
Windows
Relevant log output
App Version: vibe 2.6.3
Commit Hash: d24ffccb0d05ea822ff1a3a6edb3b9871be9f368
Arch: x86_64
Platform: windows
Kernel Version: 10.0.19045
OS: windows
OS Version: 10.0.19045
Cuda Version: n/a
Models: ggml-medium.bin
Default Model: "C:\\Users\\Me\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin"
Cargo features: vulkan
{
  "avx": {
    "enabled": true,
    "support": true
  },
  "avx2": {
    "enabled": true,
    "support": true
  },
  "f16c": {
    "enabled": true,
    "support": true
  },
  "fma": {
    "enabled": true,
    "support": true
  }
}
Please share an example YouTube video that it happens with, or upload the audio, and tell me which language to choose so I can reproduce it.
Hi, the language doesn't really matter. Whether I choose "auto detect language", "german" or "english", it's all exclamation marks.
Regarding the audio and video: that also doesn't matter in my case. Different files and formats all resulted in the same problem.
I even changed from the AMD Pro drivers to the Gaming drivers; nothing changed.
I am sure you will be able to transcribe anything fine, just as I can with the CPU version (except that it's really slow).
Is there anything else I can provide to help?
Maybe related to ggerganov/whisper.cpp#2400
I have the same issue when transcribing audio clips longer than ~8 seconds. Vulkan build, 7900XTX, Windows 10.
Could it be related to this issue?
ggerganov/llama.cpp#10434
Could it be related to this issue?
Totally. Do you experience the same issue? I can try updating whisper.cpp in vibe and release a beta version.
I released a beta version with the new code:
https://github.com/thewh1teagle/vibe/releases/download/v2.6.7/vibe_2.6.7_x64-setup.exe
Let me know if the problem is fixed.
Hi,
thanks for the update.
The first attempt transcribed "audio audio audio audio audio" and then crashed.
The second one failed instantly with "Boundary error:
Error: Non-negative timestamp expected".
After that, I couldn't close the "Error A bug happened" dialog, even when clicking "close".
This happens with various audio file inputs after a few seconds, even after reinstalling. GPU usage doesn't go above 3-4%.
With versions <= 2.6.6 I experience the same exclamation-mark problem; with version 2.6.7 I get "pulp" or a negative timestamp error, depending on the audio file. I have an AMD Ryzen 9 7940HS with an IPU but no GPU.
I had the same problem. What worked for me was to uninstall the amdvlk and lib32-amdvlk drivers
and keep only the vulkan-radeon and lib32-vulkan-radeon drivers.
This was on Arch Linux.