New convert.py creating massive files?
hippalectryon-0 opened this issue · 3 comments
Using the "old" convert.py (https://raw.githubusercontent.com/ggerganov/llama.cpp/master/convert.py):
ggml-model-q4_0.bin (4 GB) -> new.bin (4 GB), takes a few seconds
Using the "new" convert.py (the one in main):
ggml-model-q4_0.bin (4 GB) -> modelsnew.bin (26 GB!), takes a few minutes
What's going on? ^^
Also, this might be caused by a newer LlamaCpp update. Go back to the old version until it's fixed.
Found the culprit @alxspiker :P
Line 1178 in 6eed358
I'm not sure why we hardcoded "f32", but I think it should be None. I'll open a PR soon.
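To see why that hardcoded "f32" would explain the size jump, here's a rough back-of-the-envelope sketch. The function name and the size logic below are illustrative, not llama.cpp's actual code: the idea is that writing an already-quantized q4_0 model with an output type of "f32" dequantizes every weight to 4-byte floats, while an output type of None would keep the source quantization (q4_0 is roughly 4.5 bits per weight, counting block scales).

```python
def output_size_bytes(n_params, outtype):
    """Rough per-model size estimate (hypothetical helper, for illustration only)."""
    if outtype == "f32":
        # every weight dequantized to a 4-byte float32
        return n_params * 4
    if outtype is None:
        # keep the source q4_0 quantization: ~4.5 bits per weight
        return n_params * 4.5 / 8
    raise ValueError(f"unknown outtype: {outtype!r}")

n = 7_000_000_000  # ~7B parameters
print(output_size_bytes(n, "f32") / 2**30)  # ~26 GiB, matching the bloated file
print(output_size_bytes(n, None) / 2**30)   # ~3.7 GiB, close to the original 4 GB
```

The ~26 GiB figure for "f32" lines up with the 26 GB file reported above, which is consistent with the hardcoded output type being the culprit.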
Honestly, I just took an existing convert file and changed some things. I don't know much about models in general; I just know how to load and use the llama7b model.