
Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Sentence Splitter

Script to split documents into sentences.


conda env create
conda activate sentence-splitter
spacy download en_core_web_sm


Pass an input file with one document per line:

./sentence_splitter.py INPUT_FILE > OUTPUT_FILE

The output has one sentence per line, with documents separated by an empty line. Alternatively, you can pass the input on stdin instead of as a file argument.
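
To illustrate the input and output format described above, here is a minimal Python sketch. It is not the script's actual implementation: `naive_split` is a hypothetical regex-based stand-in for the spaCy model, and `split_documents` and `read_documents` are illustrative names. The sketch assumes the blank line appears between documents and that no document is empty.

```python
import re


def naive_split(document):
    """Hypothetical stand-in splitter: break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]


def split_documents(lines, split_fn=naive_split):
    """Yield output lines: one sentence per line, an empty line between documents."""
    first = True
    for line in lines:
        if not first:
            yield ""  # separator between consecutive documents
        first = False
        yield from split_fn(line)


def read_documents(lines):
    """Group output lines back into documents, each a list of sentences."""
    docs, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line:
            current.append(line)
        elif current:
            # An empty line closes the current document.
            docs.append(current)
            current = []
    if current:
        docs.append(current)
    return docs
```

For example, `read_documents(open("OUTPUT_FILE"))` would return one list of sentences per input document, which may be convenient when the split sentences need to be mapped back to their source documents.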

To see more information about the script and its available options, run:

./sentence_splitter.py --help