plaintext

Extract human-language text, as plain UTF-8, from computer code and markup.

The output is (or should be) line-preserving: no lines are added or removed. For example, <p> foo </p> becomes foo.
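As a rough illustration of what line-preserving extraction means, here is a minimal Python sketch. It is not plaintext's actual implementation: the function name `strip_tags_preserving_lines` and the regular expression are assumptions made for this example, and it handles only simple HTML/XML-style tags. It shows one way to remove markup while leaving the number of lines unchanged.

```python
import re

# Hypothetical pattern (an assumption, not taken from plaintext):
# matches a simple HTML/XML-style tag, which may span several lines.
TAG = re.compile(r"<[^<>]*>")


def strip_tags_preserving_lines(text: str) -> str:
    """Remove tags but keep every newline, so the line count is unchanged."""

    def keep_newlines(match: re.Match) -> str:
        # Drop the tag's characters, keeping only the newlines inside it.
        return "\n" * match.group(0).count("\n")

    return TAG.sub(keep_newlines, text)


source = "<p>\nfoo\n</p>\n"
result = strip_tags_preserving_lines(source)

# Same number of newlines before and after: nothing added, nothing removed.
assert result.count("\n") == source.count("\n")
```

Because a removed tag is replaced by exactly the newlines it contained, any text that survives stays on the same line number as in the input, which makes it easy to map extracted text back to its source.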