This is a fork of https://github.com/facebookresearch/llama that runs on CPU.
Installation and usage are exactly the same as upstream, so please refer to the official instructions.
The 7B model generates roughly one word every 1.5 seconds on a MacBook Pro M1.
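
To get a feel for what that speed means in practice, here is a minimal sketch that turns the figure above into a rough wall-clock estimate. It assumes the ~1.5 seconds per word holds steady for the whole generation, which may not be true on other hardware or for long prompts. `estimate_seconds` is a hypothetical helper for illustration and is not part of this repository.

```python
# Rough estimate only: assumes the ~1.5 s/word figure quoted above stays
# constant for the whole run (hypothetical helper, not part of this repo).
SECONDS_PER_WORD = 1.5


def estimate_seconds(num_words: int, seconds_per_word: float = SECONDS_PER_WORD) -> float:
    """Return the approximate wall-clock time needed to generate num_words."""
    return num_words * seconds_per_word


if __name__ == "__main__":
    for n in (50, 256):
        print(f"{n} words: ~{estimate_seconds(n):.0f} s")
```

At that rate, a 256-word completion takes a little over six minutes, so short generation lengths are a sensible starting point on a laptop.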