html2text is a very simple script that uses Ruby's DOM methods to load HTML from a string, and then iterates over the resulting DOM to correctly output plain text. For example:
<html>
<title>Ignored Title</title>
<body>
<h1>Hello, World!</h1>
<p>This is some e-mail content.
Even though it has whitespace and newlines, the e-mail converter
will handle it correctly.
<p>Even mismatched tags.</p>
<div>A div</div>
<div>Another div</div>
<div>A div<div>within a div</div></div>
<a href="http://foo.com">A link</a>
</body>
</html>
Will be converted into:
Hello, World!
This is some e-mail content. Even though it has whitespace and newlines, the e-mail converter will handle it correctly.
Even mismatched tags.
A div
Another div
A div
within a div
A link
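The DOM walk described above can be illustrated with a rough sketch. This is not the gem's implementation: it assumes well-formed markup (so it cannot cope with the mismatched tags shown earlier), it uses REXML from Ruby's standard distribution as the parser, and the names `BLOCK_TAGS`, `IGNORED_TAGS`, `walk`, and `sketch_html_to_text` are hypothetical, chosen only for this example.

```ruby
require 'rexml/document'

# Hypothetical tag lists for this sketch; the gem's own rules may differ.
BLOCK_TAGS = %w[h1 h2 h3 p div].freeze
IGNORED_TAGS = %w[head title script style].freeze

# Recursively visit a node and append its text to the buffer `out`.
def walk(node, out)
  case node
  when REXML::Text
    # Collapse runs of whitespace (including newlines) into single spaces.
    text = node.value.gsub(/\s+/, ' ')
    out << text unless text.strip.empty?
  when REXML::Element
    name = node.name.downcase
    return if IGNORED_TAGS.include?(name)

    block = BLOCK_TAGS.include?(name)
    out << "\n" if block              # block elements start on a new line
    node.children.each { |child| walk(child, out) }
    out << "\n" if block              # and end with a line break
  end
end

# Parse the markup, walk the tree, then drop empty lines.
def sketch_html_to_text(html)
  doc = REXML::Document.new(html)
  out = String.new
  walk(doc.root, out)
  out.split("\n").map(&:strip).reject(&:empty?).join("\n")
end

html = <<~HTML
  <html>
  <title>Ignored Title</title>
  <body>
  <h1>Hello, World!</h1>
  <p>This is some e-mail content.
  With a newline.</p>
  <div>A div<div>within a div</div></div>
  </body>
  </html>
HTML

puts sketch_html_to_text(html)
```

The sketch only shows the general idea of ignoring some elements, collapsing whitespace in text nodes, and breaking lines around block elements. For real HTML, including markup that is not well-formed, use `Html2Text.convert` instead.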
See the original blog post or the related StackOverflow answer.
Add the gem to your Gemfile and run bundle install:
gem 'html2text'
Then you can convert an HTML string to plain text:
require 'html2text'
text = Html2Text.convert(html)
See all of the test cases defined in spec/examples/. These can be run with:
bundle install
rspec
html2text is licensed under MIT.
- html2text, the original PHP implementation.
- actionmailer-html2text, automatically generate text parts for HTML emails sent with ActionMailer.