dave-p/TVHadmin

error with certain search terms in search_epg function

Closed this issue · 5 comments

Hi,

I found a slight problem when I searched for the term "New", and have a workaround below that works for me, which I added to the search_epg function. You may have a better way of doing this, but I thought I'd pass it on in case you want to incorporate it into TVHadmin (which I think is great by the way :-) )

cheers

$json = file_get_contents($url);
// Replace any EM (0x19) or CAN (0x18) bytes, which can cause json_decode to fail
$json = preg_replace('/\x19/', "'", $json);
$json = preg_replace('/\x18/', "'", $json);
$j = json_decode($json, true);

// json_last_error() returns JSON_ERROR_NONE (0) when decoding succeeded
$error = json_last_error();
if ($error !== JSON_ERROR_NONE)
{
    echo 'No Data - Error=' . $error . "\n";
}

Update: I don't think the actual search term I used is the issue, except that it returned a lot of data and so increased the chances of the EM and/or CAN characters being returned, which json_decode can't handle (it gives error code 3, JSON_ERROR_CTRL_CHAR). I have found that the Travel Channel and Food Network sometimes use these characters in their EPG data.
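The failure can be reproduced without a TVHeadend server. The sketch below builds a small JSON string by hand (the `summary` field and its text are made up for illustration), puts a raw EM byte where an apostrophe should be, and checks what `json_decode` reports before and after the byte is replaced.

```php
<?php
// Hypothetical sample data: a raw EM byte (0x19) sits where an apostrophe belongs.
$json = "{\"summary\":\"Heston\x19s classic\"}";

// Decoding the raw text fails because of the unescaped control character.
$before = json_decode($json, true);
$errorBefore = json_last_error();

// Swap stray CAN/EM bytes for a plain apostrophe, then decode again.
$fixed = preg_replace('/[\x18\x19]/', "'", $json);
$after = json_decode($fixed, true);
$errorAfter = json_last_error();

echo "error before fix: $errorBefore\n";
echo "error after fix:  $errorAfter\n";
echo "summary: ", $after['summary'] ?? '(none)', "\n";
```

A single character class covers both bytes in one pass, which is equivalent to the two separate replacements above.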

Hi, thanks for the report.

Are you using UK Freeview? I eventually came to the same conclusion, that there are occasional invalid characters tripping up the JSON decode. On Freeview a programme called "Flipping Bangers" on BLAZE is causing the search crash, though on Freesat the invalid bytes don't appear.

What confused me more is that I have seen these 'highlight' bytes in the 'now & next' data on Freesat.

I'll read up on the JSON spec and work out a regex to remove all non-valid characters.
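One possible shape for such a regex, as a sketch only and not the fix that was actually committed: JSON does not allow raw control characters (0x00 to 0x1F) inside strings, and the only ones allowed anywhere else are insignificant whitespace, so removing all of them before decoding should be safe. The helper name `strip_json_control_chars` and the sample data are hypothetical.

```php
<?php
/**
 * Hypothetical helper: remove every raw control byte (0x00-0x1F) from JSON text.
 * Bytes of multi-byte UTF-8 characters are all >= 0x80, so they are untouched.
 */
function strip_json_control_chars(string $json): string
{
    // preg_replace() returns null on failure; fall back to the input in that case.
    return preg_replace('/[\x00-\x1F]/', '', $json) ?? $json;
}

// Made-up EPG-like record containing CAN (0x18) and EM (0x19) bytes.
$raw = "{\"title\":\"Flipping Bangers\",\n \"summary\":\"It\x19s a \x18classic\x19\"}";

$data = json_decode(strip_json_control_chars($raw), true);
echo $data['title'], ': ', $data['summary'], "\n";
```

The trade-off is that the characters are dropped rather than translated, so any apostrophe encoded as one of these bytes disappears from the decoded text.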

While looking through search.php I also found a stupid cut 'n' paste error which I'll also fix.

Hi,

yes using UK Freeview. I'd already encountered this problem on a fairly basic TVH frontend I was writing before I found yours. Mine's more just an EPG really, in a format I like as opposed to the TVHeadend one, that I started writing when the online EPG at http://xmltv.radiotimes.com ceased to be.

For me it was always the Food Network or Travel Channel that caused it and it was always one or other of those two characters, usually inserted where apostrophes would normally be. On your search it was doing a really wide search on something like "new" or "and" that triggered it, but I don't know which channels.

In my local copy of your search.php code I've now also placed an "if (is_array($results))" check between the two lines

	$results = search_epg("", $find);
	foreach ($results as $r) {
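
Written out in full, the guard might look like the sketch below. The `search_epg` here is only a stub so the example runs on its own; it assumes the real function returns a non-array value such as null when the EPG JSON cannot be decoded, and the `entries` and `title` keys are illustrative.

```php
<?php
// Stub standing in for TVHadmin's search_epg() (assumption: on a decode
// failure the real function returns something that is not an array).
function search_epg(string $channel, string $find)
{
    $json = "{\"entries\":[{\"title\":\"New\x19s\"}]}"; // raw EM byte breaks decoding
    $j = json_decode($json, true);
    return $j['entries'] ?? null;
}

$titles = [];
$results = search_epg("", "new");
if (is_array($results)) {
    foreach ($results as $r) {
        $titles[] = $r['title'];
    }
} else {
    // Without the guard, foreach would emit a warning on a non-array value.
    echo "No results: EPG data could not be decoded\n";
}
```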

I've really got into TVHeadend lately, upgraded to an HD tuner, and use Kodi with the pvr-hts client to watch stuff. Was also pleased to get comskip working with it this week.

I'll keep an eye out for any updates you make.

I had planned to raise a bug report against TVHeadend, on the basis that it should not be emitting unparseable JSON, but looking deeper it does seem to be the broadcasters' fault. The Unicode code points for the left and right single quotation marks are U+2018 and U+2019 respectively, and the low bytes of these (0x18 and 0x19) are somehow getting into the EPG. Here's a nice rant on the situation:

https://ukfree.tv/transmitters/tv/Winter_Hill/PGSTART1790/irt836083#b836083
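
The low-byte observation is easy to check with a bit mask. This sketch only does the arithmetic; where in the broadcast chain the high byte is lost is not something this thread establishes.

```php
<?php
// Code points of the left and right single quotation marks.
$quotes = ['left' => 0x2018, 'right' => 0x2019];

$lowBytes = [];
foreach ($quotes as $name => $codePoint) {
    // Keep only the least significant byte of the code point.
    $lowBytes[$name] = $codePoint & 0xFF;
    printf("U+%04X (%s single quote) -> low byte 0x%02X\n",
        $codePoint, $name, $lowBytes[$name]);
}
```

0x18 is the ASCII control character CAN and 0x19 is EM, matching the two bytes replaced in the original workaround.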

At the moment there are 66 broken EPG records, all on the 'minor' channels, and in each case the problem bytes are in the Summary field.

I've pushed out a slightly more generic fix for the issue. Thanks again. Closing.

That's better than my quick fix as it's generic as you say.

Replacing with a null char does now lose the apostrophe, e.g. "Antony and Cleopatra Charlton Hestons classic", and "Jump the Whale Shark/Frédéric Bartholdi" becomes "Jump the Whale Shark/Frdric Bartholdi", but that is better than trying to work out what every dodgy character should be and replacing it. Thanks.
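
The regex in the pushed fix is not quoted in this thread, so the following is only a guess at the behaviour described: losing the "é" would be consistent with a filter that keeps printable ASCII only. The sketch compares that with a narrower filter that drops just the control bytes. Both helper names and the sample string are made up, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical filter 1: keep only printable ASCII (0x20-0x7E).
function keep_printable_ascii(string $s): string
{
    return preg_replace('/[^\x20-\x7E]/', '', $s) ?? $s;
}

// Hypothetical filter 2: drop only raw control bytes (0x00-0x1F).
function drop_control_bytes(string $s): string
{
    return preg_replace('/[\x00-\x1F]/', '', $s) ?? $s;
}

// Sample text with an EM byte (0x19) in place of the apostrophe.
$summary = "Heston\x19s classic / Frédéric Bartholdi";

echo keep_printable_ascii($summary), "\n"; // accented letters are removed too
echo drop_control_bytes($summary), "\n";   // accented letters survive
```

Either way the apostrophe is lost; the narrower filter just avoids damaging legitimate accented characters.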