MacBook Pro M3 Crashes

Question

aaronedell opened this issue 10 months ago · 5 comments

When I run the benchmark with

--engine WHISPER_LARGE \
--dataset LIBRI_SPEECH_TEST_OTHER \
--dataset-folder /Users/aedell/Documents/GitHub/speech-to-text-benchmark/datasets/LibriSpeech/test-other  \

my MacBook Pro's memory spikes and then the machine locks up. According to ChatGPT, the possible causes are:

Concurrent Execution: The script uses ProcessPoolExecutor for parallel processing. If the dataset is large and the number of processes (num_workers) is not optimally set, this could lead to high memory usage as each process might consume a significant amount of memory.

Dataset and Engine Initialization: Depending on how the Dataset and Engine classes are implemented (not visible in the provided script), loading large datasets or initializing multiple instances of speech-to-text engines could consume a lot of memory.

Memory Leaks: If there are any memory leaks within the Engine or Dataset classes (e.g., not properly releasing resources), repeated calls in a loop could exacerbate memory consumption over time.

Any suggestions?

Answer 1 · 2024-02-15T22:39:36.000Z

Do you get the same issue when running other engines or datasets? Or is it only this combination of arguments?

Answer 2 · 2024-02-16T16:10:27.000Z

I tried it with LibriSpeech/test-clean and got the same result. Here is a screengrab of Activity Monitor just before the crash.

Answer 3 · 2024-02-16T16:18:04.000Z

However, when I try with WHISPER_SMALL I don't have the issue.

Answer 4 · 2024-02-16T18:16:45.000Z

I see. Whisper large is a big model and takes a lot of RAM to run. By default we start one worker per CPU core, and each worker holds its own copy of the model, so memory use grows with the number of cores. Try reducing the number of workers with the --num-workers argument, for example to 7 or 8. I imagine it will then run fine.
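To make the reasoning concrete, here is a rough sketch of how one might pick a worker count from available RAM instead of guessing. It is not the benchmark's own code: `pick_num_workers` and `transcribe_stub` are hypothetical names, and the 10 GB per-worker figure is only an illustrative assumption, so measure the real footprint of one worker on your machine first.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def pick_num_workers(per_worker_gb, total_ram_gb, reserve_gb=4.0, cpu_count=None):
    """Return how many workers fit in RAM, capped at the CPU count, at least 1.

    All arguments are estimates supplied by the caller (hypothetical helper).
    """
    cpus = cpu_count or os.cpu_count() or 1
    # Leave some memory for the OS and other applications.
    usable_gb = max(total_ram_gb - reserve_gb, 0.0)
    fit_in_ram = int(usable_gb // per_worker_gb)
    return max(1, min(cpus, fit_in_ram))


def transcribe_stub(path):
    # Placeholder for the real per-file work. In a real run each worker
    # process would hold its own copy of the model.
    return path.upper()


if __name__ == "__main__":
    # Assumed numbers: 36 GB machine, roughly 10 GB per worker.
    n = pick_num_workers(per_worker_gb=10.0, total_ram_gb=36.0)
    with ProcessPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(transcribe_stub, ["a.flac", "b.flac"]))
    print(n, results)
```

Whatever number this gives can then be passed to the benchmark via --num-workers. If one worker alone does not fit in memory, lowering the worker count will not help and a smaller model such as WHISPER_SMALL is the safer choice.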

Answer 5 · 2024-03-11T19:09:09.000Z

Closing due to inactivity