Dan-wanna-M/kbnf

Add `mask_token_ids` and `update_token_ids` API


Some users might want to mask only an array of token ids (e.g. the candidates left after top_p or top_k filtering) rather than the whole logits vector. Given how the FFI works, we probably need the caller to provide the output buffer.
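To make the caller-provided buffer idea concrete, here is a rough sketch of what such a call could look like. Everything in it is an assumption for illustration: the free function `mask_token_ids` shown here, its signature, the `u8` flag buffer, and the `allowed` set (standing in for whatever the engine currently accepts) are hypothetical and not taken from kbnf's existing API.

```rust
use std::collections::HashSet;

/// Hypothetical helper, NOT part of kbnf's current API.
/// `allowed` is a stand-in for the set of token ids the engine accepts right now.
/// For each candidate id, writes 1 into the matching slot of `out` if the id is
/// allowed and 0 otherwise. The caller owns `out`, so nothing is allocated here,
/// which keeps the call easy to expose over FFI.
/// Returns Err(required_len) if `out` is shorter than `candidates`.
fn mask_token_ids(
    allowed: &HashSet<u32>,
    candidates: &[u32],
    out: &mut [u8],
) -> Result<(), usize> {
    if out.len() < candidates.len() {
        return Err(candidates.len());
    }
    // zip stops at the shorter side, so any extra slots in `out` are left untouched.
    for (slot, id) in out.iter_mut().zip(candidates) {
        *slot = u8::from(allowed.contains(id));
    }
    Ok(())
}
```

A caller would pass the ids that survived top_p/top_k plus a buffer of at least the same length, then read the flags back by position. Returning the required length on error is just one possible convention for a too-small buffer.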

This will essentially stop the cache from functioning, so it should probably be implemented after the eager regex cache.