$.html() is not a DOM serializer, $.htmlize() is.
document.body.innerHTML = '<input>';
document.body.firstChild.value = 'foo';
$('body').html() === '<input>';
$('input').htmlize() === '<input value="foo">';
$('#serialize-me').htmlize();
By default, it returns the concatenated outerHTML of the matched nodes, computed on clones because the plugin modifies attributes. You can override this with the following options:
$('#serialize-me').htmlize({innerHTML: true, clone: false});
In order to create PDF exports from the DOM via wkhtmltopdf, I needed to serialize the DOM. The context was a web app I'm still working on, running on Backbone.js, with lots of form elements and calculations, like spreadsheets…
So because of client-side templating and user interaction, what I needed wasn't the HTML sent by the server, but exactly the DOM the user had in front of their eyes.
It quickly appeared that element.innerHTML didn't fit my needs. If you use innerHTML on a form, you'll see that fields (input, textarea, select…) are badly serialized: some attributes aren't updated, value for instance, but also disabled, selected…
I searched and tried many approaches before writing this plugin: for instance the XMLSerializer object, available in Firefox and WebKit browsers, but it has the same problem as innerHTML.
Without a solution after my research, I thought for a moment about traversing and serializing every node "by hand". It sounded too bad, so I just forgot about it and tried to find a smarter way. Thing is, innerHTML does work well most of the time…
So I started to play with DOM nodes to see how innerHTML behaves when attributes change. It seemed that only fields are problematic, and only some attributes. I guessed the cause was the way browsers render them, like iframes or whatever. I also found that the value attribute doesn't behave the same way depending on the input's type.
For instance, with a text field the value is not updated in innerHTML, but it is with checkboxes! It kind of makes sense to me: browsers don't need to reflow a checkbox when its value changes, but they do with text fields. So I tried with the checked attribute, and my guess was right: as a checkbox needs a reflow when checked/unchecked, this attribute wasn't updated in innerHTML.
My first hack idea was almost there: store a copy of those risky attributes in a safe attribute, so I decided to go with data-\*. As I knew which attributes of which nodes were problematic, I just had to select those nodes, get those attributes, back them up, strip the stale ones from innerHTML and restore them from data-\*.
Let's say we have an input field with a default value of "foo". I use outerHTML for better comprehension, but it doesn't work on Firefox, just sayin'…
input.outerHTML === '<input value="foo">';
The user changes it to bar.
input.outerHTML === '<input value="foo">';
but
input.value === 'bar';
Back up the risky attribute to a safe attribute
input.outerHTML === '<input value="foo" data-backup-value="bar">';
One RegExp later
<input data-backup-value="bar">
Another RegExp later
<input value="bar">
And here it is! Good serialization. Things could have been easier, but here are two more things I needed to do:
- Do the hack on a cloned node rather than the node itself, as I'm adding some extra attributes.
- Handle the textarea special case: value and innerHTML are very close but different. Only the value property changes, not the innerHTML, so it needs this.innerHTML = this.value;
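The two RegExp passes from the walkthrough above can be sketched on plain strings. This is only an illustration of the idea, not the plugin's code: the helper names `dropStale` and `restoreBackup` are hypothetical, the `data-backup-` prefix is taken from the example above, and the patterns assume double-quoted attribute values that contain no `>`.

```javascript
// Prefix used for the backup attributes, as in the walkthrough above.
const BACKUP_PREFIX = 'data-backup-';

// Pass 1: in every tag that carries a backup copy of `name`,
// remove the stale `name` attribute itself.
function dropStale(html, name) {
  const tag = new RegExp('<[^>]*\\s' + BACKUP_PREFIX + name + '="[^"]*"[^>]*>', 'g');
  const stale = new RegExp('\\s' + name + '="[^"]*"');
  return html.replace(tag, (t) => t.replace(stale, ''));
}

// Pass 2: rename the backup attribute to the real attribute name.
function restoreBackup(html, name) {
  const backup = new RegExp('\\s' + BACKUP_PREFIX + name + '=', 'g');
  return html.replace(backup, ' ' + name + '=');
}

const backedUp = '<input value="foo" data-backup-value="bar">';
const stripped = dropStale(backedUp, 'value');     // '<input data-backup-value="bar">'
const restored = restoreBackup(stripped, 'value'); // '<input value="bar">'
console.log(restored);
```

Tags without a backup attribute are left untouched by the first pass, so only the fields that were backed up get rewritten.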
While writing this README, I made some other tests… I discovered that it isn't only about the kind of attribute, it's also about the way they are set! Actually, it's a story about attributes and properties, anyway…
input.outerHTML === '<input value="foo">';
input.value = 'bar';
input.outerHTML === '<input value="foo">';
input.setAttribute('value', 'foobar');
input.outerHTML === '<input value="foobar">';
The weird thing is…
input.value === 'bar';
But we don't care! We only need to do:
input.setAttribute('value', input.value);
Actually, it's just like resyncing the attributes: no more RegExp, no more data-\* attribute tricks, only resyncing.
So that's the second idea and the current implementation: resyncing.
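As a rough sketch of what resyncing can look like (not the plugin's actual source), the function below copies live properties back onto attributes. The names `resync`, `BOOLEAN_PROPS` and `fakeElement` are hypothetical; `fakeElement` is only a stand-in so the example also runs outside a browser. For textarea, the sketch assigns `textContent` rather than `innerHTML`, an assumption made here so that the typed text is not parsed as markup.

```javascript
// Boolean properties mentioned in this README that innerHTML may miss.
const BOOLEAN_PROPS = ['checked', 'disabled', 'selected'];

// Copy the live state of a field back onto its attributes.
function resync(el) {
  const tag = el.tagName.toLowerCase();
  if (tag === 'textarea') {
    // A textarea serializes its content, not a value attribute.
    el.textContent = el.value;
    return el;
  }
  if (tag === 'input') {
    el.setAttribute('value', el.value);
  }
  for (const name of BOOLEAN_PROPS) {
    if (!(name in el)) continue; // property does not exist on this element
    if (el[name]) el.setAttribute(name, name);
    else el.removeAttribute(name);
  }
  return el;
}

// Stand-in for a DOM element (assumption: only the few members used above).
function fakeElement(tagName, props) {
  const attrs = {};
  return Object.assign({
    tagName: tagName.toUpperCase(),
    setAttribute(name, value) { attrs[name] = String(value); },
    removeAttribute(name) { delete attrs[name]; },
    getAttribute(name) { return name in attrs ? attrs[name] : null; },
  }, props);
}

const input = fakeElement('input', { value: 'bar' }); // the user typed "bar"
input.setAttribute('value', 'foo');                   // default from the markup
resync(input);
console.log(input.getAttribute('value')); // 'bar'
```

In a browser, the same function could be applied to each field of a cloned subtree before reading its innerHTML, which leaves the original attributes untouched.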
I kept the clone abstraction because in some cases you may need to access the initial attributes, as references for default values for instance.
Hope it helps!