Avoid character encoding issue in addPOIexec

Question

nattomi opened this issue 13 years ago · 4 comments

I ran into a character encoding problem with the place name returned by the GeoNames service. The only way I could work around it was to replace the line
values["nearbyplace"] = "%s [%s]" % (name, country)
in addPOIexec with
values["nearbyplace"] = "%s [%s]" % (name.encode("utf-8"), country.encode("utf-8"))
I would recommend making this modification, or something equivalent; otherwise bug reporting at locations with, for instance, "ő" in their name won't work. Other characters may be problematic too, but that was the one I ran into.
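To illustrate the mechanics behind the workaround, here is a minimal sketch. It is written for Python 3 (the snippets in this thread are Python 2), and `format_nearby_place` and `ascii_encoding_fails` are hypothetical helpers, not functions from the project. It assumes the name arrives as a Unicode string, and shows that "ő" cannot be represented in ASCII but encodes cleanly to UTF-8 bytes.

```python
# Sketch only: Python 3, hypothetical helper names (not part of the project).

def format_nearby_place(name, country):
    """Build the "name [country]" label and return it as UTF-8 bytes."""
    label = "%s [%s]" % (name, country)
    return label.encode("utf-8")


def ascii_encoding_fails(text):
    """Return True if the text contains characters outside ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


# "Gy\u0151r" is "Győr", a place name containing "ő" (U+0151).
label_bytes = format_nearby_place("Gy\u0151r", "HU")
name_breaks_ascii = ascii_encoding_fails("Gy\u0151r")
country_breaks_ascii = ascii_encoding_fails("HU")
```

Here "ő" becomes the two bytes 0xC5 0x91, while the country code passes through unchanged. Whether an explicit encode is the right fix depends on whether the value is a Unicode string or already a byte string at that point in the code.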

emka commented 13 years ago

Thanks!

Answer 1 · 2011-07-14T19:48:48.000Z

Similarly, in getRSSfeed,
print "<title>%s (near %s)</title>%s%s?lat=%s&lon=%s&zoom=18%srssitem?id=%s%sgeo:lat%s/geo:latgeo:long%s/geo:long" % (type, c[6], desc, server_uri, c[2], c[1], api_uri, c[0], pubDate, c[2], c[1])
should be replaced with
print "<title>%s (near %s)</title>%s%s?lat=%s&lon=%s&zoom=18%srssitem?id=%s%sgeo:lat%s/geo:latgeo:long%s/geo:long" % (type, c[6].encode("utf-8"), desc, server_uri, c[2], c[1], api_uri, c[0], pubDate, c[2], c[1])

Answer 2 · 2012-03-04T15:27:59.000Z

Please review this change; it apparently makes things even worse. See issue #29.

Answer 3 · 2012-03-22T23:07:26.000Z

GeoNames is already returning the place names in UTF-8. Why do they need to be encoded again?
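One way a second encode can make things worse is sketched below, in Python 3. This is an assumption about a possible mechanism, not a diagnosis of issue #29: if bytes that are already UTF-8 get interpreted as Latin-1 text and then encoded again, every non-ASCII character turns into mojibake. (In Python 2, calling `.encode("utf-8")` on a byte string first decodes it implicitly as ASCII, which raises `UnicodeDecodeError` for non-ASCII bytes.)

```python
# Sketch only: demonstrates double encoding with a sample place name.

original = "Gy\u0151r"                      # "Győr" as a Unicode string

# First (correct) encode: what a UTF-8 service would send on the wire.
utf8_bytes = original.encode("utf-8")

# Mistake: treat the UTF-8 bytes as Latin-1 text, then encode again.
misread_text = utf8_bytes.decode("latin-1")
double_encoded = misread_text.encode("utf-8")

# Decoding the doubly encoded bytes once does not give the original back.
round_trip = double_encoded.decode("utf-8")
is_corrupted = round_trip != original
```

The single character "ő" (two bytes in UTF-8) ends up as four bytes that decode to two unrelated characters, so the stored name no longer matches what GeoNames sent.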

It looks like the country is returned as an ISO 3166-1 alpha-2 code. These two-letter codes use only the characters U+0041 to U+005A (A-Z). It's not strictly necessary to encode them as UTF-8 here, because UTF-8 is backwards compatible with US-ASCII, and agrees with ISO 8859-1 (latin1_* in MySQL), for U+0000 to U+007F.
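The compatibility claim can be checked with a short Python 3 sketch; the sample values "HU" and "é" are chosen here for illustration only. An A-Z country code produces identical bytes in ASCII, Latin-1, and UTF-8, whereas a character above U+007F does not.

```python
# Sketch only: compare encodings for an ASCII-only code and a non-ASCII letter.

country = "HU"  # example ISO 3166-1 alpha-2 code

same_in_all_three = (
    country.encode("ascii")
    == country.encode("latin-1")
    == country.encode("utf-8")
)

# Every character of the code lies in U+0041..U+005A (A-Z).
in_az_range = all(0x41 <= ord(ch) <= 0x5A for ch in country)

# Above U+007F the encodings diverge: "é" is U+00E9.
latin1_bytes = "\u00e9".encode("latin-1")
utf8_bytes_e = "\u00e9".encode("utf-8")
diverges_above_7f = latin1_bytes != utf8_bytes_e
```

So encoding the country code is harmless but redundant; only the place name can contain characters where the choice of encoding matters.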