tazz4843/whisper-rs

Unable to use Metal feature on Mac M1 Max (32 GB)

valiksb opened this issue · 17 comments

I'm fairly new to Rust and wanted to start with a project that lets me learn it while working on something I like, which is Whisper. whisper-rs seems like the ideal solution for this, but I ran into some problems; see below.

  1. I created a brand new Rust project and copied https://github.com/tazz4843/whisper-rs/blob/master/examples/audio_transcription.rs and renamed it main.rs
  2. made the suggested changes to my Cargo.toml
    [dependencies]
    hound = "3"
    whisper-rs = { version = "0.10.0" }
  3. Then I did cargo run
  4. It worked fine but it took too long compared to whisper.cpp

I added this line println!("[{}]", print_system_info()); to see what was being used, and this is the output:
[AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 0 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | ]
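For anyone who wants to check that flag from code rather than by eye, here is a minimal sketch that parses a system-info line of the shape shown above. The helper names `parse_system_info` and `metal_enabled` are hypothetical (they are not part of whisper-rs), and the parser assumes the `NAME = 0|1 | ...` layout printed in this thread. Note that `METAL = 1` only says Metal support was compiled in; as the rest of this thread shows, initialization can still fail at runtime.

```rust
use std::collections::HashMap;

/// Parses a system-info line such as "AVX = 0 | NEON = 1 | METAL = 1 | "
/// into a map from flag name to whether it is enabled.
/// Surrounding square brackets (added by the println! above) are ignored.
fn parse_system_info(info: &str) -> HashMap<String, bool> {
    info.trim()
        .trim_start_matches('[')
        .trim_end_matches(']')
        .split('|')
        .filter_map(|entry| {
            // Entries without '=' (e.g. the trailing blank one) are skipped.
            let (name, value) = entry.split_once('=')?;
            Some((name.trim().to_string(), value.trim() == "1"))
        })
        .collect()
}

/// Returns true only if a METAL flag is present and set to 1.
fn metal_enabled(info: &str) -> bool {
    parse_system_info(info).get("METAL").copied().unwrap_or(false)
}
```

Passing the string returned by print_system_info() to `metal_enabled` gives a quick check that the `metal` feature actually made it into the build.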

Then I enabled Metal with
features = ["metal"] and ran again. I got this sysinfo and an ASSERT
[AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | ]
and this is the ASSERT
ggml_metal_graph_compute: command buffer 0 failed with status 5
GGML_ASSERT: /path_to/my_whisper_sample/target/release/build/whisper-rs-sys-2c6c0c9736fdf6b6/out/whisper.cpp/ggml-metal.m:1611: false

I don't believe this is a bug, but I wasn't sure where to post this question about the issue I'm having.

Thanks in advance

I've not used Metal much so am unable to help much here. Perhaps modifying build.rs to unconditionally build Metal might help?

thanks. Let me try that

Did you figure anything out? I learned that if you explicitly copy the files ggml-metal.h, ggml-metal.m, and ggml-metal.metal into the current working directory, it (whisper.cpp, I think) will try a few paths and eventually load them. But inference doesn't work (it produces garbage output). This is frustrating, as I would rather use Rust for my programs than C++.

(I have since switched to using coreml)

(I'm using an M1 Pro on 13.6.2)

I'm new to debugging Rust lib build steps, so I'm sure there's terminology I'm not using that would make this clearer!

I tried cloning this project and adding a missing file (ggml-common.h) to the include list in the sys directory's Cargo.toml and now I find that the main issue is the inflexible way that whisper.cpp tries to source the Metal files.

At runtime, it either looks in an environment variable or falls back to the current directory, neither of which corresponds to the location of the Metal files, which end up in target/.

I'm not sure if whisper.cpp needs patching to fix this, or if there is some magic we can do in the whisper-rs build process to make runtime behavior work cleanly and out of the box.

But I can report that when I set the GGML_METAL_PATH_RESOURCES environment variable at the right target/ subfolder (specifically the out directory under whisper-rs-sys), inference works properly and with the expected speed.
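A rough sketch of that workaround done from inside the program, using only the standard library. The helper names `find_metal_resources` and `point_metal_at` are hypothetical, and the `<target>/<profile>/build/whisper-rs-sys-*/out/whisper.cpp` layout is assumed from the paths quoted in this thread; it is a stopgap for local development, not something to ship, since the target/ directory will not exist on another machine.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Looks under `<target_dir>/<profile>/build/whisper-rs-sys-*/out/whisper.cpp`
/// for a directory that contains `ggml-metal.metal`.
/// The directory layout is an assumption based on the paths in this thread.
fn find_metal_resources(target_dir: &Path, profile: &str) -> Option<PathBuf> {
    let build_dir = target_dir.join(profile).join("build");
    for entry in fs::read_dir(build_dir).ok()?.flatten() {
        if !entry.file_name().to_string_lossy().starts_with("whisper-rs-sys-") {
            continue;
        }
        let candidate = entry.path().join("out").join("whisper.cpp");
        if candidate.join("ggml-metal.metal").is_file() {
            return Some(candidate);
        }
    }
    None
}

/// Sets GGML_METAL_PATH_RESOURCES for the current process.
/// Call this before creating the whisper context and before spawning threads.
#[allow(unused_unsafe)]
fn point_metal_at(dir: &Path) {
    // `set_var` is an unsafe fn in the 2024 edition; the block is a no-op wrapper in older ones.
    unsafe {
        std::env::set_var("GGML_METAL_PATH_RESOURCES", dir);
    }
}
```

In a `main`, something like `if let Some(dir) = find_metal_resources(Path::new("target"), "release") { point_metal_at(&dir); }` ahead of the context setup would mirror setting the variable in the shell.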

Also - the log trampoline doesn't seem to capture the GGML_METAL_LOG_{LEVEL} calls.

But I can report that when I set the GGML_METAL_PATH_RESOURCES environment variable at the right target/ subfolder (specifically the out directory under whisper-rs-sys), inference works properly and with the expected speed.

If this is all it takes then this might be an easy fix. I can draft a PR.

Oh awesome - I wasn't sure if we'd really need to set this environment variable at runtime to make it work, or if we do, whether that's acceptable practice.

If we need to set it at runtime I think it would be best to raise an upstream issue, but I think we'll be able to do it in build.rs without issues based off what you've said.

See 39042a8, try branch try-fix-metal-build to see if it works.

Awesome - If I understand correctly now, I think my original description was slightly off - GGML_METAL_PATH_RESOURCES should be the whisper.cpp dir, not its parent(?) - the build dir for whisper-rs-sys.

Sorry it took me a bit to get back to this: maybe try 1e3adf1

@tazz4843: I am personally getting the following on 1e3adf1.

ggml_metal_init: allocating
ggml_metal_init: found device: Apple M1 Pro
ggml_metal_init: picking default device: Apple M1 Pro
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: error: could not use bundle path to find ggml-metal.metal, falling back to trying cwd
ggml_metal_init: loading 'ggml-metal.metal'
ggml_metal_init: error: Error Domain=NSCocoaErrorDomain Code=260 "The file “ggml-metal.metal” couldn’t be opened because there is no such file." UserInfo={NSFilePath=ggml-metal.metal, NSUnderlyingError=0x600000f1d980 {Error Domain=NSPOSIXErrorDomain Code=2 "No such file or directory"}}
whisper_backend_init: ggml_backend_metal_init() failed

So it doesn't seem to be picking it up with the new change.

I can also pass it manually when running the final binary, e.g. by pointing the environment variable at target/release/build/whisper-rs-sys-f18f04168e33a9e5/out/whisper.cpp/ when running cargo run; then I get the expected:
ggml_metal_init: GGML_METAL_PATH_RESOURCES = target/release/build/whisper-rs-sys-f18f04168e33a9e5/out/whisper.cpp/
Everything works fine and seems performant. I used a trivial example, so I did not really measure the performance or anything.

I think the variable needs to be set at runtime. From the code @dev-msp linked https://github.com/ggerganov/whisper.cpp/blob/ac283dbce7d42735e3ed985329037bf23fe180aa/ggml-metal.m#L333, I am not sure there is a way to properly fix this in build.rs -- although I am just really quickly skimming this codebase, and can't offer any informed ideas yet.

It seems this is more something that should be part of the WhisperContext. (For context: I am trying to figure out how one can generate a statically linked shippable binary from the example @valiksb shared.)

P.S. I am on 14.0 on M1 Pro. @dev-msp -- I am guessing you are getting the same thing, but I am curious if there are any discrepancies.

P.S.2: ggerganov/llama.cpp#5376 Perhaps related? And the build script should embed metallib? (i.e. via https://github.com/ggerganov/whisper.cpp/blob/ac283dbce7d42735e3ed985329037bf23fe180aa/ggml-metal.m#L322)

@eftychis Yes, getting same thing on same hardware, though I'm on 13.6. Also runs fine when the variable is set at runtime.

My solution is to just copy the "ggml-metal.metal" file from the out dir, where the whisper.cpp folder lives, to the CWD.
But again, GGML_METAL_LOG output is not captured by whisper_log_trampoline on the Rust side.
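The copy step could be sketched like this with the standard library. `copy_metal_shader` is a hypothetical helper, and it copies only ggml-metal.metal, as described above; if the shader in your whisper.cpp revision pulls in other files (an earlier comment mentions ggml-common.h), those may need to sit next to it as well.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Copies `ggml-metal.metal` from `resources_dir` (the whisper.cpp folder
/// inside the whisper-rs-sys out dir) into `dest_dir`, typically the
/// working directory, and returns the path of the copy.
fn copy_metal_shader(resources_dir: &Path, dest_dir: &Path) -> io::Result<PathBuf> {
    let src = resources_dir.join("ggml-metal.metal");
    let dst = dest_dir.join("ggml-metal.metal");
    // fs::copy overwrites an existing destination file.
    fs::copy(&src, &dst)?;
    Ok(dst)
}
```

Calling it with `std::env::current_dir()?` as `dest_dir` before initializing the context matches the cwd fallback visible in the ggml_metal_init log earlier in the thread.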

#148
I have just created a naive fix. Could anyone give it a try?

Can confirm that @hlhr202's fix is working for me on the latest commit on master with the following in my Cargo.toml.

whisper-rs = { git = "https://github.com/tazz4843/whisper-rs.git", branch = "master", features = ["whisper-cpp-log", "metal"] }