sgraaf/Replicate-Toronto-BookCorpus

Question: What about Project Gutenberg as an alternative source?

Closed this issue · 1 comment

I appreciate the suggestion! I looked into this myself previously (per Google's suggestion), but I consider it beyond the scope of this repository. The goal of this repository is to create a faithful replica of the original Toronto BookCorpus (TBC) dataset, and that cannot be accomplished with books from Project Gutenberg, mostly because those books are (understandably) old and their writing styles are dated.