sparklemotion/nokogiri.org-tutorials

discuss how to avoid prettyprinting

flavorjones opened this issue · 6 comments

requested on the nokogiri-talk mailing list

Current solution, which is pretty obscure and works only on Nodes, not e.g. NodeSets:
doc.serialize(:save_with => 0)

Suggested syntax:
doc.to_xml(:prettyprint => false)

Which parallels the existing:
doc.to_xml(:indent => 0)
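For anyone wondering why :save_with => 0 turns off prettyprinting: save_with is a bitmask, and 0 clears every flag, including the one that requests formatting. Below is a minimal sketch of that arithmetic in plain Ruby. The flag values are assumed to mirror libxml2's xmlSaveOption enum (which Nokogiri exposes under Nokogiri::XML::Node::SaveOptions), the "default" mask is an assumption, and serialize_each is a hypothetical helper for the NodeSet case, not part of Nokogiri.

```ruby
# Flag values assumed to mirror libxml2's xmlSaveOption enum.
# Nokogiri itself is not loaded here; this only shows the bit arithmetic.
format_flag = 1 << 0   # request pretty-printed (indented) output
as_xml_flag = 1 << 5   # force XML serialization rules

# Assumed default mask for XML output: formatting on, serialize as XML.
assumed_default = format_flag | as_xml_flag

# Passing 0 clears everything. Clearing only the formatting bit keeps
# the other flags in place.
unformatted = assumed_default & ~format_flag

# Hypothetical helper: serialize each node of a collection with the same
# save_with mask and join the results, since the save_with workaround is
# described above as working on Nodes only, not NodeSets.
def serialize_each(nodes, save_with: 0)
  nodes.map { |node| node.serialize(:save_with => save_with) }.join
end
```

With Nokogiri loaded, something like serialize_each(doc.css("p"), save_with: unformatted) would be the intended use, assuming each node accepts the same :save_with option shown above.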

tozz commented

Doesn't work in 1.5.0

@tozz - I'm confused. What doesn't work?

tozz commented

Sorry about that, example:
https://gist.github.com/9580b0871874b610849c (GitHub comments have some fetish for parsing HTML, it seems)

Prettyprinting should never be the default when using to_html(), since parsers don't care about it (it would be better to enable it with :prettyprint => true if you need debug output for some reason).
This is a problem when converting between bbcode and HTML, or when user input contains custom tags for site-specific content: you really don't want extra newlines added for no good reason.

Ah, I see -- you're saying that in Nokogiri 1.5.0 under JRuby the save_with option doesn't appear to omit pretty printing. Correct?

This is pretty likely, I think, since the serialize options (serialize is the underlying implementation of to_html, etc.) are closely bound to libxml2, and the pure-Java backend doesn't use libxml2 (it uses Xerces and other libraries). And we don't have great test coverage for what we viewed as libxml2 implementation details.

So, what we need to do is create a test case out of this gist and kindly ask @yokolet to take a look at it. I'll create the failing test case.

tozz commented

Actually, this is for MRI (1.9.2-p180, libxml2 2.7.8).