RUCKBReasoning/codes

Tokenize error


When I ran the tokenize_pt_corpus.py script to tokenize the corpus, it failed with an error:
[Screenshot 2024-04-06 at 16 40 28]

This error was most likely caused by running out of memory while tokenize_pt_corpus.py was running.
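If memory is the cause, one common way to reduce peak usage is to stream the corpus in small batches and write results out as you go, rather than holding everything in memory. The sketch below is only an illustration of that idea and is not taken from tokenize_pt_corpus.py: the function names, the whitespace tokenizer, and the one-JSON-list-per-line output format are all assumptions made for the example.

```python
import json
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def iter_batches(lines: Iterable, batch_size: int) -> Iterator[List]:
    """Yield lists of at most batch_size items without materializing the input."""
    it = iter(lines)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def whitespace_tokenize(text: str) -> List[str]:
    # Placeholder tokenizer (assumption); swap in the real tokenizer here.
    return text.split()


def tokenize_streaming(
    in_path: str,
    out_path: str,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
    batch_size: int = 1000,
) -> int:
    """Tokenize in_path line by line and write one JSON list per line to out_path.

    Only batch_size lines are held in memory at any time.
    Returns the number of lines written.
    """
    written = 0
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for batch in iter_batches(src, batch_size):
            for line in batch:
                dst.write(json.dumps(tokenize(line)) + "\n")
                written += 1
    return written
```

Lowering `batch_size` (or the number of worker processes, if the real script uses any) trades speed for a smaller memory footprint.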

Thanks. I finally solved the problem.