Loading model into RAM at prepare step is redundant
Closed this issue · 0 comments
BobaZooba commented
A user ran out of RAM during the prepare step of the xllm-demo project. This happens because the prepare step both downloads the model and loads it into RAM. Loading the model at this stage is redundant and can cause the same out-of-memory failure for other users. Downloading the model is sufficient.
Link: BobaZooba/xllm-demo#1
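A minimal sketch of the proposed behavior, assuming the model is hosted on the Hugging Face Hub and that `huggingface_hub.snapshot_download` is an acceptable way to fetch it. The function name `prepare_model` and the `download_fn` parameter are hypothetical and are not taken from the xllm codebase.

```python
from typing import Callable, Optional


def prepare_model(
    model_id: str,
    download_fn: Optional[Callable[[str], str]] = None,
) -> str:
    """Fetch model files to local disk without instantiating the model.

    Hypothetical helper: the name and signature are illustrative only.
    Returns the local directory path reported by the downloader.
    """
    if download_fn is None:
        # Assumption: the model lives on the Hugging Face Hub.
        # Imported lazily so the dependency is only needed for real downloads.
        from huggingface_hub import snapshot_download

        download_fn = snapshot_download

    # The downloader only writes files to the local cache.
    # No weights are deserialized here, so RAM usage stays low.
    return download_fn(model_id)
```

The model would then be loaded only at the step that actually needs it, such as training, so the prepare step no longer depends on how much RAM is available.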