Add `mask_token_ids` and `update_token_ids` API
Dan-wanna-M commented
Some users might want to mask only an array of token ids (e.g. the candidates that survive `top_p` or `top_k` filtering) rather than the whole logits. Given how the FFI works, we probably need the caller to provide an output buffer.
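A rough sketch of the calling convention this suggests, written in Python for illustration only. The signature, the `allowed` set, and the return value are all assumptions, not the existing API: the caller passes the candidate token ids plus a preallocated output buffer, and the function fills that buffer instead of allocating a new one.

```python
from typing import MutableSequence, Sequence, Set


def mask_token_ids(
    allowed: Set[int],
    token_ids: Sequence[int],
    out: MutableSequence[bool],
) -> int:
    """Hypothetical signature: mark which candidate token ids are allowed.

    `allowed` stands in for whatever the engine currently accepts.
    `out` is the caller-provided buffer; out[i] is set to True when
    token_ids[i] is allowed. Returns the number of allowed candidates.
    """
    if len(out) < len(token_ids):
        raise ValueError("output buffer is smaller than token_ids")
    count = 0
    for i, token_id in enumerate(token_ids):
        is_allowed = token_id in allowed
        out[i] = is_allowed
        count += is_allowed
    return count


# Usage: mask only the candidates left after top_k/top_p filtering.
allowed = {2, 5, 7}               # assumed engine state
candidates = [5, 3, 7, 9]         # e.g. ids kept by top_k
out = [False] * len(candidates)   # buffer owned by the caller
n_allowed = mask_token_ids(allowed, candidates, out)
```

Because the caller owns the buffer, nothing allocated on the library side has to cross the FFI boundary. Only the first `len(token_ids)` entries of `out` are written.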
Dan-wanna-M commented
This will essentially stop the cache from functioning, so it should probably be implemented after the eager regex cache.