TS Optimization: make the LRU cache pluggable and faster
connor4312 opened this issue · 0 comments
connor4312 commented
In my performance metrics, once #35 is merged, about 25% of tokenization time is spent inside calls to the lru-cache
module. While that module is nice and flexible, it may not be optimized for the Tokenizer's access patterns. The cache should be swappable, and the tokenizer module could also ship a fast, simple LRU implementation of its own.
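A minimal sketch of what "swappable plus a fast default" could look like. The `LRUCache` interface and `SimpleLRU` class names are hypothetical, not part of the tokenizer's current API; the implementation leans on `Map`'s insertion-order iteration to avoid the bookkeeping overhead of a general-purpose cache library:

```typescript
// Hypothetical interface the Tokenizer could accept, so callers can plug
// in lru-cache, this default, or anything else with the same shape.
interface LRUCache<K, V> {
  get(key: K): V | undefined;
  set(key: K, value: V): void;
}

// Minimal LRU built on Map's insertion-order iteration: a read re-inserts
// the entry to mark it most-recently-used; a write evicts the first key in
// iteration order (the least recently used) once capacity is exceeded.
class SimpleLRU<K, V> implements LRUCache<K, V> {
  private readonly map = new Map<K, V>();

  constructor(private readonly capacity: number) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value === undefined) {
      return undefined;
    }
    // Move to the back of the iteration order (most recently used).
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.capacity) {
      // Evict the least recently used entry (first in iteration order).
      this.map.delete(this.map.keys().next().value as K);
    }
    this.map.set(key, value);
  }
}
```

Keeping the default to just `get`/`set` on a single `Map` skips the features (TTLs, size calculation, disposal callbacks) that a general-purpose cache pays for on every operation.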