ffmpeg in path but not detected by whisperfile on Windows 10
Thanks for this issue, I'll try to replicate it in a VM and see what can be done. I don't have a Windows primary machine, so it may take me a little time.
I am happy to help test, just let me know what I can do.
I'm running behind on a few things and haven't got a Windows VM set up yet. By any chance, have you tried WSL?
Unfortunately I don't use WSL...
It seems to me the system() call is possibly failing?
I checked the source here:
whisperfile/whisper.cpp/server/server.cpp
Line 212 in 37c5046
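One way to narrow this down from the outside is to check whether a shell-based invocation can find ffmpeg at all. The sketch below is only a diagnostic written in Python: `shell_can_run` is a hypothetical helper, and `subprocess.run(..., shell=True)` merely approximates what a C `system()` call does, so a passing result does not prove that whisperfile's own call succeeds.

```python
import shutil
import subprocess


def shell_can_run(command: str) -> bool:
    """Return True if the platform shell runs `command` with exit code 0."""
    result = subprocess.run(
        command,
        shell=True,  # cmd.exe on Windows, /bin/sh elsewhere
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Where (if anywhere) a PATH lookup resolves ffmpeg for this process.
    print("shutil.which:", shutil.which("ffmpeg"))
    # Whether the shell itself can start it.
    print("shell can run ffmpeg:", shell_can_run("ffmpeg -version"))
```

If `shutil.which` prints a path but the shell check fails, the problem is more likely in how the command is launched than in the PATH entry itself.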
This is not a solution, but it may help you get running in the short term if you don't need --convert specifically.
Would you mind trying release 0.1.2? Make sure to download a .bin file in addition.
You can use it like whisperfile-0.1.2 -m whisper.large-v3.bin
The whisperfiles I've uploaded to HF have the --convert flag as part of the run command. If you don't need that specific feature, this should work around the ffmpeg dependency until I can get a Windows machine up and running for development and testing.
I was trying to do this now, but I can't seem to find whisper.large-v3.bin
The model on Hugging Face has several .bin files in its files tab, but how can I know which one to use?
Those ones likely will not work directly. I would try the ones I've made for whisper.cpp:
https://huggingface.co/cjpais/whisperfile/tree/main
Maybe this one
Can confirm the same issue on Windows 11. I do need the convert functionality, and I'm using the whisperfile without a model built in.
Potentially you could allow us to pass in the ffmpeg path explicitly for now.
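Until something like that exists in whisperfile, a script that runs ffmpeg itself can apply the same idea on the client side. This is a minimal sketch under assumptions: `resolve_ffmpeg` and the `FFMPEG_PATH` environment variable are hypothetical names chosen for illustration, not options that whisperfile or ffmpeg define.

```python
import os
import shutil
from typing import Optional


def resolve_ffmpeg(explicit_path: Optional[str] = None) -> str:
    """Pick an ffmpeg executable: explicit argument, then env var, then PATH."""
    # FFMPEG_PATH is a made-up variable for this sketch.
    candidates = [explicit_path, os.environ.get("FFMPEG_PATH")]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate):
            return candidate

    # Fall back to the normal PATH lookup.
    found = shutil.which("ffmpeg")
    if found:
        return found

    raise FileNotFoundError(
        "ffmpeg not found; pass a full path or set FFMPEG_PATH"
    )
```

On Windows the explicit value would be the full path to `ffmpeg.exe`, which sidesteps PATH lookup entirely.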
OK, I had this too. Here is a workaround.
Just do the conversion with ffmpeg yourself first. The raw_audio_file variable is just a path string for the input file.
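import os
import subprocess

async def convert_audio(self, raw_audio_file):
    audio_path = os.path.join(".", "temp.wav")
    # Run ffmpeg to convert the input to 16 kHz mono 16-bit PCM WAV
    cmd = [
        "ffmpeg",
        "-nostdin",
        "-y",  # overwrite a temp.wav left over from a previous run
        "-threads", "0",
        "-i", raw_audio_file,
        "-acodec", "pcm_s16le",
        "-ar", "16000",
        "-ac", "1",
        "-f", "wav",
        audio_path
    ]
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(cmd, stderr=subprocess.DEVNULL, check=True)
    return audio_path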
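The snippet above is a method that always writes to ./temp.wav, so two conversions running at once would clobber each other. Below is a standalone sketch of the same conversion that writes to a unique temporary file instead. `build_convert_cmd` and `convert_to_wav` are hypothetical names, and the ffmpeg arguments are the ones from the workaround plus `-y`, which is needed because `mkstemp` creates the output file in advance.

```python
import os
import subprocess
import tempfile


def build_convert_cmd(src: str, dst: str, ffmpeg: str = "ffmpeg") -> list:
    """Build the ffmpeg argument list for 16 kHz mono 16-bit PCM WAV output."""
    return [
        ffmpeg,
        "-nostdin",
        "-y",  # the destination already exists (created by mkstemp)
        "-i", src,
        "-acodec", "pcm_s16le",
        "-ar", "16000",
        "-ac", "1",
        "-f", "wav",
        dst,
    ]


def convert_to_wav(src: str, ffmpeg: str = "ffmpeg") -> str:
    """Convert `src` to a uniquely named WAV file and return its path."""
    fd, dst = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # ffmpeg reopens the path itself
    try:
        subprocess.run(
            build_convert_cmd(src, dst, ffmpeg),
            stderr=subprocess.DEVNULL,
            check=True,
        )
    except Exception:
        os.remove(dst)  # do not leave an empty file behind on failure
        raise
    return dst
```

The caller is responsible for deleting the returned file once the transcription request has finished.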
Also here is a function to call whisper.
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

def call_inference_api(file_path, temperature=0.0, response_format='text'):
    # Keep the file open only for the duration of the upload
    with open(file_path, "rb") as audio_file:
        # Build the multipart/form-data body
        multipart_data = MultipartEncoder(
            fields={
                "file": ("temp.wav", audio_file, "audio/wav"),
                "temperature": str(temperature),
                "response_format": response_format
            }
        )
        # The encoder supplies the Content-Type, including the boundary
        headers = {
            "Content-Type": multipart_data.content_type
        }
        # POST to the local whisperfile server
        response = requests.post(
            'http://127.0.0.1:8080/inference',
            headers=headers,
            data=multipart_data
        )
    response.raise_for_status()
    return response.text
Still the same error on Windows 10