Trips up some parsers
aleemb opened this issue · 3 comments
There are a few issues I ran into, possibly related to the use of htmlentities() as opposed to htmlspecialchars(), as outlined in http://stackoverflow.com/questions/2822774/php-is-htmlentities-sufficient-for-creating-xml-safe-values
I believe the references to htmlentities should be replaced with htmlspecialchars.
Can you post examples?
I did a bit of code gymnastics a while back so I don't remember the exact error condition but it possibly had to do with using an ndash and single-quote in the same string:
htmlentities("–'s");     // the en dash is converted to &ndash;
htmlspecialchars("–'s"); // the en dash is left as a literal –
As per the predefined XML entities:
&lt; < less than
&gt; > greater than
&amp; & ampersand
&apos; ' apostrophe
&quot; " quotation mark
Nothing else needs to be HTML-encoded.
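As a minimal sketch of the point above, the snippet below escapes a string for XML using only the five predefined entities and compares that with what htmlentities does to an en dash. The helper name xml_escape is hypothetical and not part of any library; ENT_XML1 and ENT_QUOTES are standard PHP flags, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: escape text for XML element content or attribute values.
function xml_escape(string $text): string
{
    // ENT_QUOTES escapes both quote types; ENT_XML1 emits &apos; for the apostrophe.
    return htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// "\u{2013}" is an en dash.
$escaped = xml_escape("Tom & Jerry\u{2013}'s <show>");
echo $escaped, PHP_EOL; // Tom &amp; Jerry–&apos;s &lt;show&gt;

// htmlentities also converts the en dash to a named HTML entity,
// which a plain XML parser does not know without a DTD.
$named = htmlentities("\u{2013}", ENT_QUOTES, 'UTF-8');
echo $named, PHP_EOL; // &ndash;
```

The en dash passes through xml_escape untouched, which is fine as long as the XML document declares a UTF-8 encoding.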
In my case it was an iOS application that was parsing the XML and showing the single quotes as an escaped entity instead of an actual single quote, or something along those lines. This was happening because I was escaping the string manually as well as with htmlentities, so it could very well be that the string was being double-encoded. Either way, the issue is gone now since I am escaping the string using htmlspecialchars, but I noticed you are still using htmlentities in your code.
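To illustrate how double encoding could produce that symptom, here is a small sketch. It is an assumed reconstruction, not the actual code from the application: the string is escaped twice, and a single decoding pass (standing in for what an XML parser does) leaves a visible entity behind.

```php
<?php
$flags = ENT_QUOTES | ENT_XML1;

$raw   = "Tom's";
$once  = htmlspecialchars($raw, $flags, 'UTF-8');  // Tom&apos;s
$twice = htmlspecialchars($once, $flags, 'UTF-8'); // Tom&amp;apos;s

// A parser decodes exactly one level, so the reader sees the inner entity as text.
$shown = html_entity_decode($twice, $flags, 'UTF-8');
echo $shown, PHP_EOL; // Tom&apos;s
```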
To add to the confusion, I think there is an additional better practice, which is to write:
// better: encodes exactly once, even if $foo is already encoded
htmlentities(html_entity_decode($foo));
// can result in double encoding if $foo is already encoded
htmlentities($foo);
The first call is idempotent; the second is not.
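The idempotency claim can be checked with a short sketch. The wrapper name encode_once is hypothetical, and the flags and UTF-8 encoding are passed explicitly as an assumption so that the result does not depend on the PHP version's defaults.

```php
<?php
// Hypothetical wrapper around the decode-then-encode pattern.
function encode_once(string $text): string
{
    $flags = ENT_QUOTES | ENT_HTML401;
    return htmlentities(html_entity_decode($text, $flags, 'UTF-8'), $flags, 'UTF-8');
}

$raw = "\u{2013}'s"; // en dash followed by 's

$first  = encode_once($raw);    // &ndash;&#039;s
$second = encode_once($first);  // unchanged: &ndash;&#039;s

// Plain htmlentities applied twice escapes the ampersands again.
$double = htmlentities(
    htmlentities($raw, ENT_QUOTES | ENT_HTML401, 'UTF-8'),
    ENT_QUOTES | ENT_HTML401,
    'UTF-8'
); // &amp;ndash;&amp;#039;s

echo $first, PHP_EOL, $second, PHP_EOL, $double, PHP_EOL;
```

One trade-off of decoding first: input that is meant to contain a literal entity reference as text (for example a tutorial that mentions "&amp;amp;") is collapsed by one level, so this pattern only suits input where entities are never intended literally.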
Since php-rss-writer doesn't use htmlentities, I can't tell what the issue is. I'm closing this issue, but feel free to reopen it.