llm-jp/llm-jp-corpus

Add the `token_ids` field

Closed · 1 comment

What

llm-foundry benefits when each corpus record already provides its text as a sequence of token IDs.

How

Include a `token_ids` field in each record of the resulting corpus.
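
A minimal sketch of what this could look like, not the actual llm-jp-corpus implementation. It assumes records are dicts with a `text` key (the issue does not name the key) and that the tokenizer is any callable returning a list of ints. `add_token_ids` and `toy_encode` are hypothetical names, and `toy_encode` is only a byte-level stand-in for the real tokenizer.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List


def add_token_ids(
    records: Iterable[Dict],
    encode: Callable[[str], List[int]],
    text_key: str = "text",  # assumed field name; not specified in the issue
) -> Iterator[Dict]:
    """Yield a copy of each record with an added `token_ids` field."""
    for record in records:
        out = dict(record)  # shallow copy so the input record is not mutated
        out["token_ids"] = encode(record[text_key])
        yield out


def toy_encode(text: str) -> List[int]:
    """Stand-in tokenizer: one ID per UTF-8 byte. Replace with the real one."""
    return list(text.encode("utf-8"))


if __name__ == "__main__":
    sample = [{"text": "abc"}]
    for rec in add_token_ids(sample, toy_encode):
        # One JSON object per line, as in a JSONL corpus file
        print(json.dumps(rec, ensure_ascii=False))
```

Passing the tokenizer in as a callable keeps the corpus step independent of any particular tokenizer library; the existing fields are left untouched.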