Post-processing of text extracted from PDFs, in preparation for use as training files.
The full report on this project's research is available at: https://docs.google.com/document/d/1Y8cTQslwrnYaNJeJIWJnk4eJn-KIKhcbxqM6Cf9bm5g/edit?usp=sharing