dlite-tools/NLPiper

Global Document Statistics

Opened this issue · 0 comments

🚀 Feature

Calculate statistics about the processed data.

Motivation

Global statistics for the processed documents would be of great interest, such as the number of characters, tokens, and processed documents, the reduction achieved by cleaners, etc.
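
To make the request more concrete, here is a minimal sketch of what such statistics could look like. It does not use any NLPiper API: `DocumentStats`, `strip_digits`, whitespace tokenization, and measuring reduction in characters are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DocumentStats:
    """Hypothetical accumulator for global statistics over processed documents."""

    n_documents: int = 0
    n_chars_before: int = 0
    n_chars_after: int = 0
    n_tokens: int = 0

    def update(self, original: str, cleaned: str) -> None:
        # Record one document before and after cleaning.
        self.n_documents += 1
        self.n_chars_before += len(original)
        self.n_chars_after += len(cleaned)
        # Assumption: tokens are whitespace-separated chunks of the cleaned text.
        self.n_tokens += len(cleaned.split())

    @property
    def reduction(self) -> float:
        # Fraction of characters removed by cleaning (0.0 for empty input).
        if self.n_chars_before == 0:
            return 0.0
        return 1 - self.n_chars_after / self.n_chars_before


def strip_digits(text: str) -> str:
    """Stand-in cleaner (not an NLPiper cleaner): drop digit characters."""
    return "".join(ch for ch in text if not ch.isdigit())


stats = DocumentStats()
for doc in ["abc 123", "hello world 42"]:
    stats.update(doc, strip_digits(doc))

print(stats.n_documents, stats.n_chars_before, stats.n_chars_after, stats.n_tokens)
print(f"reduction: {stats.reduction:.1%}")
```

An accumulator like this could be updated once per document while a pipeline runs, so the totals are available without a second pass over the data.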