OvertureMaps/data

Places names contain control characters


Some of the place names contain control characters, for example backspace (U+0008).
This can cause unexpected side effects when the data is processed further.

For example, there are 220 (out of 60 million) POIs whose name contains a backspace character:
select names['common'][1].value as name from places where contains(name, chr(8));

I suggest removing control characters from POI names before publishing the Parquet release.
Do you see any downsides?
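
To make the suggestion concrete, here is a minimal sketch of the kind of cleanup I have in mind. It is not taken from the Overture pipeline: the function name `strip_control_chars` is hypothetical, and it assumes that "control character" means Unicode general category Cc (the C0 and C1 controls, which include backspace, tab and newline).

```python
import unicodedata


def strip_control_chars(name: str) -> str:
    """Return `name` without characters in Unicode category Cc.

    Hypothetical helper for illustration only. Category Cc covers
    U+0000-U+001F and U+007F-U+009F, so backspace (U+0008) is removed,
    but so are tab and newline.
    """
    return "".join(ch for ch in name if unicodedata.category(ch) != "Cc")


if __name__ == "__main__":
    # The backspace between "f" and "e" is dropped; other text is untouched.
    print(strip_control_chars("Caf\x08e Roma"))
```

One possible downside: because tab and newline are also removed rather than replaced, two words separated only by such a character would be joined. Replacing them with a space instead might be preferable.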

I've removed the backspace characters. I have not performed a careful search for other control characters that might be in the names.
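
For anyone who wants to check for the remaining control characters, a survey along these lines might help. This is only a sketch under assumptions: `count_control_chars` is a hypothetical helper, it takes the names as a plain iterable of Python strings (loading them from the Parquet files is left out), and it again treats Unicode category Cc as the definition of a control character.

```python
import unicodedata
from collections import Counter
from typing import Iterable


def count_control_chars(names: Iterable[str]) -> Counter:
    """Count, per control character, how many names contain it.

    Hypothetical helper for illustration only. Keys are code points
    formatted like "U+0008"; each name is counted at most once per
    character, however often the character repeats inside it.
    """
    counts: Counter = Counter()
    for name in names:
        for ch in set(name):  # de-duplicate characters within one name
            if unicodedata.category(ch) == "Cc":
                counts[f"U+{ord(ch):04X}"] += 1
    return counts


if __name__ == "__main__":
    sample = ["A\x08B", "C\tD\t", "plain", "\x08\x08"]
    for code_point, n in sorted(count_control_chars(sample).items()):
        print(code_point, n)
```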

Thanks!
I assume the change will be included in the next official version?

Closed as fixed. Feel free to re-open or start a new issue with any remaining questions or other issues.