rustformers/llm

Directly load `pth`/PyTorch tensor model files

philpax opened this issue · 2 comments

At present, llama.cpp contains a Python script that converts `pth` files to the `ggml` format.

It would be nice to build that conversion into the CLI directly, so that you can load the original model files. The original Python script could also be ported to Rust, so that we have a fully-Rust method of converting `pth` models to `ggml`.

Re loading `pth`: serde-pickle looks quite promising, but we would need to figure out whether it can load PyTorch tensors.
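One wrinkle worth noting: modern `torch.save` checkpoints are ZIP archives containing a `data.pkl` plus raw storage blobs, while the legacy format is a bare pickle stream, so a generic pickle parser like serde-pickle would only see the pickle metadata, with tensor storages referenced indirectly via pickle "persistent IDs". As a minimal stdlib-only sketch (the enum and function names are my own, not from any crate), the two on-disk formats can at least be told apart by their magic bytes:

```rust
// Sketch: classify a checkpoint by its leading bytes. Modern `torch.save`
// output is a ZIP archive ("PK\x03\x04") holding `data.pkl` plus storage
// files; the legacy format is a bare pickle stream (0x80 followed by the
// pickle protocol version). Names here are illustrative, not a real API.
#[derive(Debug, PartialEq)]
enum CheckpointKind {
    ZipArchive,   // modern torch.save: zip with data.pkl + storage blobs
    LegacyPickle, // old torch.save: pickle stream with trailing storages
    Unknown,
}

fn classify_checkpoint(header: &[u8]) -> CheckpointKind {
    match header {
        [0x50, 0x4b, 0x03, 0x04, ..] => CheckpointKind::ZipArchive,
        [0x80, proto, ..] if *proto <= 5 => CheckpointKind::LegacyPickle,
        _ => CheckpointKind::Unknown,
    }
}

fn main() {
    assert_eq!(classify_checkpoint(b"PK\x03\x04rest"), CheckpointKind::ZipArchive);
    assert_eq!(classify_checkpoint(&[0x80, 2, 0x7d]), CheckpointKind::LegacyPickle);
    assert_eq!(classify_checkpoint(b"GGML"), CheckpointKind::Unknown);
    println!("ok");
}
```

Either way, the pickle layer only yields tensor names, dtypes, and shapes; the actual weight data has to be pulled from the storages alongside it.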

This is not complete yet. We've merged in the start of a converter, but more work is required to convert the weights.

Luckily, @KerfuffleV2 has developed a Pickle parser that can handle PyTorch tensors: https://github.com/KerfuffleV2/repugnant-pickle

We should be able to use this to convert tensors to GGML format. In the future, we could load tensors directly (I may split that out into a separate issue), but for now our focus is on loading tensors so that they can be quantised by #84 and used by `llama-cli`.
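For the conversion side, a Rust port would need to reproduce the tensor layout that llama.cpp's `convert-pth-to-ggml.py` writes. As I understand that script (treat the field order as an assumption, not a spec), each tensor is written as three little-endian `i32`s (number of dims, name length, ftype tag with 0 = f32 and 1 = f16), then the dims in reverse order, then the name bytes, then the raw data. A minimal stdlib-only sketch:

```rust
use std::io::{self, Write};

// Sketch of serializing one f32 tensor in the legacy GGML single-file layout
// used by llama.cpp's convert-pth-to-ggml.py. Field order follows that
// script: n_dims, name length, ftype (0 = f32), reversed dims, name, data.
// This is an illustrative helper, not part of any existing crate.
fn write_tensor_f32(
    out: &mut impl Write,
    name: &str,
    dims: &[u32],
    data: &[f32],
) -> io::Result<()> {
    out.write_all(&(dims.len() as i32).to_le_bytes())?;
    out.write_all(&(name.len() as i32).to_le_bytes())?;
    out.write_all(&0i32.to_le_bytes())?; // ftype 0 = f32
    for &d in dims.iter().rev() {
        // the converter writes dims in reverse order
        out.write_all(&(d as i32).to_le_bytes())?;
    }
    out.write_all(name.as_bytes())?;
    for &v in data {
        out.write_all(&v.to_le_bytes())?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut buf = Vec::new();
    write_tensor_f32(&mut buf, "tok_embeddings.weight", &[4, 2], &[0.0; 8])?;
    // 3 i32 header fields + 2 dims * 4 bytes + 21-byte name + 8 floats * 4 bytes
    assert_eq!(buf.len(), 12 + 8 + 21 + 32);
    println!("wrote {} bytes", buf.len());
    Ok(())
}
```

Quantisation (#84) would then slot in between reading the pickle-side metadata and this write step, replacing the raw f32 payload with the quantised representation and the matching ftype tag.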