Varnish <esi> tags are being stripped by DOMDocument
Closed this issue · 4 comments
Hi,
We are using varnish as a reverse proxy for ESI blocks (even on AMP pages) and this library alters <esi:include>
tags, so varnish is not able to recognize them.
we would like to be able to configure amp-toolbox-php to tolerate the presence of esi:* tags
cf https://docs.oracle.com/cd/B14099_19/caching.1012/b14046/esi.htm
Yes, this is a known problem with DOMDocument
, as the <esi>
tags can end up in place that make the document invalid HTML5.
We already have code in place for other constructs that DOMDocument
's parsing chokes on, so we can add <esi>
tags as another of these edge cases to support.
Then later in validation we'd need a way to ignore esi:include
tags. This would need to be opt-in.
@schlesseracould you hint me on these "others constructs", so I might contribute a solution.
@killerwolf These happen in AmpProject\Dom\Document
.
We have some transformations that we do before the markup is turned into a DOM here: https://github.com/ampproject/amp-toolbox-php/blob/0.5.2/src/Dom/Document.php#L506-L511
Then we revert some of these transformations again when we go back from DOM to markup: https://github.com/ampproject/amp-toolbox-php/blob/0.5.2/src/Dom/Document.php#L608-L613
Here's a rough example of how the fix should be done: contao/core-bundle#1360 (comment)
However, I'd refrain from using a random ID (which can theoretically lead to conflicts) and instead using and incrementing index instead.