Implement the input tokenizer in Fortran
certik opened this issue · 0 comments
certik commented
Currently the input tokenizer is in Python, taken from OpenAI's original implementation: https://github.com/certik/fastGPT/blob/01eb84b015d89a567245da0445c0abb7d53a8500/encode_input.py. We should implement it in Fortran, which will eliminate the need to call a Python script before running fastGPT.
We have to write tests that exercise each code path in the Python implementation to ensure our Fortran implementation is correct.
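As one example of a code path worth covering, here is a sketch of the byte-to-unicode table from OpenAI's GPT-2 encoder, which the tokenizer relies on and a Fortran port would need to reproduce exactly (this is an illustrative reconstruction, not the fastGPT code itself):

```python
def bytes_to_unicode():
    # Printable/visible byte values map to themselves; all remaining
    # byte values are remapped to code points 256 and above so every
    # byte has a printable single-character representation.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))
```

A test could assert, for instance, that the table has 256 entries and that the space byte maps to "Ġ"; the Fortran version must produce an identical mapping for tokenization to agree with the Python reference.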