ml-explore/mlx-swift-examples

LLMEval: Prevent crashing when model doesn't fit in memory

DePasqualeOrg opened this issue · 2 comments

When I run the LLMEval demo app on an iPhone, it crashes if the model is too big to fit into memory. Is there any way to prevent this from happening and gracefully show an error message? Crashing is not an option for real-world use, so I'd like to prevent this if possible.

Not prevent, but there are ways to avoid it. The system can tell you how much memory it might be willing to let you use:
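As a rough sketch of what such a query could look like: the version below assumes Metal's `recommendedMaxWorkingSetSize` is a reasonable proxy for that number, and falls back to total physical RAM when no Metal device is available. The function name `approximateMemoryBudget` is made up for illustration, and the value is only a hint from the system, not a promise.

```swift
import Foundation
#if canImport(Metal)
import Metal
#endif

/// Returns a rough upper bound, in bytes, on the memory this process might use.
/// Assumption: the Metal device's recommended working set size is a usable proxy.
/// The system does not guarantee that this much memory is actually available.
func approximateMemoryBudget() -> UInt64 {
    #if canImport(Metal)
    if #available(iOS 16.0, tvOS 16.0, macOS 10.12, *),
       let device = MTLCreateSystemDefaultDevice() {
        return device.recommendedMaxWorkingSetSize
    }
    #endif
    // Fallback: total physical RAM, which overstates what a single app may use.
    return ProcessInfo.processInfo.physicalMemory
}

let budget = approximateMemoryBudget()
print("Approximate memory budget: \(budget / (1024 * 1024)) MB")
```

On iOS, `os_proc_available_memory()` (from the `os` module) is another call that might be worth comparing against, since it reports how much more memory the current process can allocate before hitting its limit.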

In some cases the memory size reported here might be larger than the jetsam limit for your process. Make sure that you add this entitlement:

[image showing the entitlement]

That doesn't guarantee you have that much memory available (per the docs), but it does request it.

Given that, you can look at the size of the weights you are loading and estimate how much memory you will use (you may have to determine this by experimentation, as the runtime requirements of each model vary). If you determine that it is too much for the device you are on, you can show a message instead of loading the model.
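A minimal sketch of that check, under some loud assumptions: the weights are `.safetensors` files sitting directly in one directory, and runtime use is approximated as the on-disk size times an overhead factor. The helper names and the default factor of 1.5 are hypothetical placeholders, not values from MLX; you would tune the factor per model by experimentation.

```swift
import Foundation

/// Sums the on-disk size of the weight files in a model directory.
/// Assumption: weights are `.safetensors` files directly inside `directory`.
func weightFileBytes(in directory: URL) throws -> UInt64 {
    let files = try FileManager.default.contentsOfDirectory(
        at: directory, includingPropertiesForKeys: [.fileSizeKey])
    var total: UInt64 = 0
    for file in files where file.pathExtension == "safetensors" {
        let size = try file.resourceValues(forKeys: [.fileSizeKey]).fileSize ?? 0
        total += UInt64(max(size, 0))
    }
    return total
}

/// Returns true if the weights, scaled by an overhead factor, fit in the budget.
/// `overheadFactor` is a guess that covers activations, caches, and so on.
func modelLikelyFits(weightBytes: UInt64,
                     budgetBytes: UInt64,
                     overheadFactor: Double = 1.5) -> Bool {
    let estimatedBytes = Double(weightBytes) * overheadFactor
    return estimatedBytes <= Double(budgetBytes)
}

// Example with made-up sizes: 4 GB of weights against a 3 GB budget.
let gigabyte: UInt64 = 1024 * 1024 * 1024
if !modelLikelyFits(weightBytes: 4 * gigabyte, budgetBytes: 3 * gigabyte) {
    print("This model is probably too large for this device.")
}
```

In an app you would run this before loading the model and show the message in the UI rather than printing it. Because the budget is only a hint, a check that passes can still be followed by the process being terminated, so a conservative factor is the safer choice.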

See also #66 with some notes about reducing memory use.

Excellent, thank you! This is very helpful. I hope to be able to contribute to this project once I get better acquainted with it.