JonathanReeve/chapterize
A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books for computational text analysis.
Python · GPL-3.0
Issues
- 1 License (#9 opened by fakerybakery)
- 2 a minor error in code (#5 opened by eveliao)
- 1 Chapter title variations not working (#8 opened by green345)
- 0 Write tests and set up CI (#7 opened by JonathanReeve)
- 7 War and Peace doesn't parse (#6 opened by palewire)
- 0 Integrate HTML-based chapterizer (#4 opened by JonathanReeve)
- 1 Parse short stories (#3 opened by nateGeorge)
- 0 implement log mode that outputs chapter data instead of actually chapterizing (#2 opened by JonathanReeve)
- 0 use a different word tokenizer that doesn't require external data to be downloaded (#1 opened by JonathanReeve)