Directly load `pth`/PyTorch tensor model files
philpax opened this issue · 2 comments
At present, llama.cpp contains a Python script that converts `pth` files to `ggml` format. It would be nice to build this into the CLI directly, so that the original model files can be loaded without a separate step. The original Python script could also be ported to Rust, giving us a fully-Rust method of converting `pth` models to `ggml`.
Re loading `pth`: serde-pickle looks quite promising, but we would need to figure out whether it can load PyTorch tensors.
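The catch is that `torch.save` does not embed tensor data inline in the pickle stream: tensors are written as persistent IDs that point at raw storage blobs stored alongside the pickle inside the checkpoint archive. The stdlib-only sketch below mimics that scheme to show why a generic pickle parser (like a naive use of serde-pickle) fails on PyTorch checkpoints without extra support; the class names and the `("storage", size)` ID shape here are purely illustrative, not PyTorch's actual format.

```python
import io
import pickle

class StorageAwarePickler(pickle.Pickler):
    """Mimics torch.save: tensor-like payloads become persistent IDs."""
    def persistent_id(self, obj):
        # Pretend raw bytes are out-of-band "storages", like tensor data.
        if isinstance(obj, bytes):
            return ("storage", len(obj))
        return None  # pickle everything else inline

buf = io.BytesIO()
StorageAwarePickler(buf).dump({"layers.0.weight": b"\x00\x01\x02\x03"})
data = buf.getvalue()

# A plain unpickler cannot resolve persistent IDs at all.
try:
    pickle.loads(data)
    resolved_without_hook = True
except pickle.UnpicklingError:
    resolved_without_hook = False

class StorageAwareUnpickler(pickle.Unpickler):
    """A parser that understands the persistent-ID scheme can recover
    the structure and fetch the storages separately."""
    def persistent_load(self, pid):
        kind, size = pid
        return f"<{kind}: {size} bytes>"

state = StorageAwareUnpickler(io.BytesIO(data)).load()
print(resolved_without_hook)  # False
print(state)                  # {'layers.0.weight': '<storage: 4 bytes>'}
```

This is the capability any Rust pickle parser would need to expose (a hook for persistent IDs) before it can walk a `pth` state dict.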
This is not complete yet. We've merged the start of a converter, but more work is required to convert the weights.
Luckily, @KerfuffleV2 has developed a Pickle parser that can handle PyTorch tensors: https://github.com/KerfuffleV2/repugnant-pickle
We should be able to use this to convert tensors to GGML format. In the future, we could load the tensors directly (I may split that out into a new issue), but for now our focus is on loading tensors so that they can be quantised by #84 and used by `llama-cli`.
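As a rough illustration of what the converter ultimately has to emit, here is a stdlib-only sketch of serialising a single tensor record in a ggml-style binary layout. The field order used here (dimension count, name length, type flag, dims, name bytes, raw little-endian f32 data) is an assumption modelled on the legacy loader; the existing Python convert script remains the authoritative reference for the actual layout.

```python
import io
import struct

def write_tensor(out, name: str, dims: list[int], data: list[float]) -> None:
    """Write one tensor record in an assumed ggml-style layout:
    n_dims, name_len, ftype, dims..., name bytes, raw f32 data."""
    # ftype 0 = f32 is an assumption for this sketch.
    out.write(struct.pack("<iii", len(dims), len(name), 0))
    for d in dims:
        out.write(struct.pack("<i", d))
    out.write(name.encode("utf-8"))
    out.write(struct.pack(f"<{len(data)}f", *data))

buf = io.BytesIO()
write_tensor(buf, "tok_embeddings.weight", [4, 2], [0.0] * 8)
record = buf.getvalue()
print(len(record))  # 12-byte header + 8 bytes of dims + 21-byte name + 32 bytes of data = 73
```

A Rust port would do the same with `std::io::Write` and `to_le_bytes`, which is why having the tensor data accessible from the pickle parser is the main missing piece.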