This is a new set of tools for common tasks on the OSCAR corpus.
The program provides a different set of tools for each corpus version:
v1
: OSCAR 2019-like, text only (.txt files)

v2
: OSCAR 22.01-like, JSONLines, document-oriented with annotations and line-level identifications
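
To make the difference between the two layouts concrete, here is a minimal Python sketch of reading each one. It is only an illustration and is not part of these tools: the helper names `read_v1` and `read_v2` are hypothetical, and the `content` and `metadata` keys in the sample record are assumptions about the v2 schema, so check them against your actual corpus files.

```python
import io
import json


def read_v1(stream):
    """Yield non-empty lines from a v1-style plain-text stream."""
    for line in stream:
        line = line.rstrip("\n")
        if line:
            yield line


def read_v2(stream):
    """Yield one dict per document from a v2-style JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            # Each non-empty line is a complete JSON object (one document).
            yield json.loads(line)


# Hypothetical in-memory samples standing in for real corpus files.
v1_sample = io.StringIO("First line of text.\nSecond line of text.\n")
v2_sample = io.StringIO(
    # "content" and "metadata" are assumed field names, for illustration only.
    json.dumps({"content": "Hello\nWorld", "metadata": {"annotation": None}}) + "\n"
)

v1_lines = list(read_v1(v1_sample))
v2_docs = list(read_v2(v2_sample))

print(len(v1_lines), "text lines;", len(v2_docs), "document(s)")
```

In the v1 layout the unit of processing is a line of text, while in the v2 layout each line of the file is a whole document that carries its own metadata. That difference is why the tool sets are kept separate per version.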