260K Model Parameter count not right?
SpaceCowboy850 opened this issue · 1 comment
SpaceCowboy850 commented
Both from the internal reporting and from poking through the code, I'm seeing closer to 2.3M parameters for this model.
The vast majority of the parameters come from the default 32,000 Llama 2 vocab size. Was the 260K model trained on a much smaller vocab that just wasn't reported (or, more likely, did I miss it somewhere in the README)?
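For context on why the vocab dominates, here is a rough back-of-envelope check. The hidden dim of 64 below is an assumption for illustration (it matches what the stories260K doc describes, but verify against the repo); the point is that the token embedding alone scales as vocab_size × dim:

```python
dim = 64                 # assumed hidden dim for the tiny model
vocab_default = 32_000   # default Llama 2 vocab size
vocab_small = 512        # assumed custom vocab for the 260K model

# Token embedding (and the output projection, if untied) is vocab * dim
print(vocab_default * dim)  # 2_048_000 -> ~2M params from the embedding alone
print(vocab_small * dim)    #    32_768 -> small enough to fit a ~260K budget
```

So with the full 32,000-token vocab, the embedding by itself already accounts for roughly 2M of the ~2.3M parameters observed.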
karpathy commented
Yes, it uses a custom, much smaller vocab. Here are some docs that might help:
https://github.com/karpathy/llama2.c/blob/master/doc/stories260K.md
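
As a quick way to verify the count, here is a minimal PyTorch sketch using llama2.c's `model.py` (`ModelArgs` / `Transformer`); the hyperparameters below are assumptions taken from the stories260K doc linked above, so adjust them if they drift from the repo:

```python
from model import ModelArgs, Transformer

# Assumed stories260K config (see doc/stories260K.md)
args = ModelArgs(dim=64, n_layers=5, n_heads=8, n_kv_heads=4,
                 vocab_size=512, max_seq_len=512)
model = Transformer(args)

# Total trainable parameters; should land near ~260K with the small vocab
print(sum(p.numel() for p in model.parameters()))
```

Swapping `vocab_size=512` for the default 32,000 reproduces the ~2.3M figure reported in the issue.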