cjpais/whisperfile

ffmpeg in path but not detected by whisperfile on Windows 10

Opened this issue · 12 comments

I cannot start the whisperfile even though ffmpeg is definitely in the PATH. I have tried running cmd as administrator as well as copying ffmpeg.exe into the same directory, with the same result every time.


Thanks for this issue, I'll try to replicate it in a VM and see what can be done. I don't have a Windows primary machine, so it may take me a little time.

I am happy to help test, just let me know what I can do.

I'm running behind on a few things and haven't got a Windows VM set up yet. By any chance, have you tried WSL?

Unfortunately I don't use WSL...

It seems to me the system() call may be failing?

I checked the source here:

int result = system("ffmpeg -version");
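For anyone who wants to narrow this down on their own machine, here is a rough diagnostic sketch in Python. It checks two things separately: whether ffmpeg resolves on PATH at all, and whether a shell-style invocation of `ffmpeg -version` exits cleanly. The function name `probe_command` is made up for this example. Note that this goes through Python's process spawning, not whisperfile's own runtime, so a success here does not prove the system() call above works; it only helps rule PATH out as the cause.

```python
import shutil
import subprocess


def probe_command(name: str, args: str = "-version") -> dict:
    """Report how `name` resolves on PATH and whether a shell can run it."""
    # None if no matching executable is found on PATH
    resolved = shutil.which(name)
    # shell=True routes the command through the platform shell
    # (cmd.exe on Windows, /bin/sh elsewhere), similar in spirit to system().
    completed = subprocess.run(
        f"{name} {args}",
        shell=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return {"resolved_path": resolved, "shell_exit_code": completed.returncode}


if __name__ == "__main__":
    print(probe_command("ffmpeg"))
```

If `resolved_path` is a real path and `shell_exit_code` is 0 while whisperfile still reports ffmpeg as missing, the problem is more likely in how whisperfile spawns the command than in the PATH setup.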


This is not a solution, but it may help you get running in the short term if you don't need --convert specifically.

Would you mind trying release 0.1.2?

Make sure to download a .bin file in addition.

You can use it like whisperfile-0.1.2 -m whisper.large-v3.bin

The whisperfiles I've uploaded to HF have the --convert flag as part of the run command. If you don't need that specific feature, this should work around the ffmpeg dependency until I can get a Windows machine up and running for development and testing.

I was trying to do this now but I can't seem to find whisper.large-v3.bin

The model on Hugging Face has several .bin files in its repo, but how can I know which to use?

Those ones likely will not work directly. I would try the ones I've made for whisper.cpp

https://huggingface.co/cjpais/whisperfile/tree/main

Maybe this one

Can confirm the same issue on Windows 11. I do need the convert functionality, and I'm using the whisperfile without a model built in.

Potentially you could allow us to pass in the path explicitly for now

OK, I had this too. Here is a workaround.

Just do the conversion with ffmpeg yourself first. The raw_audio_file variable is just a path string for the file.

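    import os
    import subprocess

    async def convert_audio(self, raw_audio_file):
        audio_path = os.path.join(".", "temp.wav")
        # Run ffmpeg command to convert audio to 16 kHz mono 16-bit PCM WAV
        cmd = [
            "ffmpeg",
            "-nostdin",
            "-y",  # overwrite temp.wav if it exists; with -nostdin ffmpeg would otherwise exit
            "-threads", "0",
            "-i", raw_audio_file,
            "-acodec", "pcm_s16le",
            "-ar", "16000",
            "-ac", "1",
            "-f", "wav",
            audio_path
        ]

        # check=True raises CalledProcessError if ffmpeg exits with an error
        subprocess.run(cmd, stderr=subprocess.DEVNULL, check=True)
        return audio_path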

Also here is a function to call whisper.

import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

def call_inference_api(file_path, temperature=0.0, response_format='text'):

    # Open the file in a context manager so the handle is closed after the upload
    with open(file_path, "rb") as audio_file:
        # Create a MultipartEncoder object
        multipart_data = MultipartEncoder(
            fields={
                "file": ("temp.wav", audio_file, "audio/wav"),
                "temperature": str(temperature),
                "response_format": response_format
            }
        )

        # Set the headers
        headers = {
            "Content-Type": multipart_data.content_type
        }

        # Make the POST request
        response = requests.post(
            'http://127.0.0.1:8080/inference',
            headers=headers,
            data=multipart_data
        )

    response.raise_for_status()
    return response.text
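A standalone, synchronous variant of the conversion step might look like the sketch below. It is only an illustration of the same workaround: the names `build_ffmpeg_cmd` and `convert_to_wav` are invented here, and the `ffmpeg` parameter is an assumption added so that an explicit path to the executable can be passed when the PATH lookup is the part that fails. It checks for the executable up front and raises a clear error instead of letting the subprocess call fail silently.

```python
import shutil
import subprocess


def build_ffmpeg_cmd(src: str, dst: str, ffmpeg: str = "ffmpeg") -> list:
    """Build the argument list for a 16 kHz mono 16-bit PCM WAV conversion."""
    return [
        ffmpeg,
        "-nostdin",
        "-y",  # overwrite dst if it already exists
        "-i", src,
        "-acodec", "pcm_s16le",
        "-ar", "16000",
        "-ac", "1",
        "-f", "wav",
        dst,
    ]


def convert_to_wav(src: str, dst: str = "temp.wav", ffmpeg: str = "ffmpeg") -> str:
    """Convert src to WAV, failing early if ffmpeg cannot be located."""
    # shutil.which accepts either a bare command name or an explicit path
    resolved = shutil.which(ffmpeg)
    if resolved is None:
        raise FileNotFoundError(f"ffmpeg not found: {ffmpeg!r}")
    subprocess.run(
        build_ffmpeg_cmd(src, dst, resolved),
        stderr=subprocess.DEVNULL,
        check=True,  # raise CalledProcessError if ffmpeg exits with an error
    )
    return dst
```

Usage would be something like `convert_to_wav("talk.mp3", ffmpeg=r"C:\tools\ffmpeg\bin\ffmpeg.exe")`, where both the input file and the install path are hypothetical. The resulting temp.wav can then be posted to the running server without relying on --convert.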

Still the same error on Windows 10