Add function for fixing up bad data in UTF-8 VARCHAR fields
Closed this issue · 0 comments
waveform80 commented
There are several cases where a field is intended to hold UTF-8 but an upstream source supplies a mixture of ASCII, Latin-1, and UTF-8, or badly encoded UTF-8 (truncated sequences, substituted bytes, etc.). This is extremely difficult to fix within SQL itself; it would be better to have a C-based UDF that handles such cases with a relatively simple algorithm.
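The issue does not spell out the algorithm, so the following is only a rough sketch of one possibility: copy every well-formed UTF-8 sequence unchanged, and treat any byte that cannot start or continue a valid sequence as a Latin-1 character to be re-encoded as two UTF-8 bytes. The names `utf8_seq_len` and `fix_utf8` are hypothetical, and the wrapper needed to register this as a UDF in a particular database is deliberately left out.

```c
#include <stddef.h>

/* Return the length (1-4) of the well-formed UTF-8 sequence starting at s,
 * or 0 if the bytes are invalid or truncated. n is the number of bytes
 * available. Hypothetical helper, not taken from the issue. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    size_t len, i;
    unsigned long cp;

    if (n == 0) return 0;
    if (s[0] < 0x80) return 1;                               /* plain ASCII */
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; }
    else return 0;             /* stray continuation or invalid lead byte */

    if (len > n) return 0;                         /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;       /* not a continuation */
        cp = (cp << 6) | (unsigned long)(s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if (len == 2 && cp < 0x80) return 0;
    if (len == 3 && (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF))) return 0;
    if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) return 0;
    return len;
}

/* Write a cleaned copy of in[0..in_len) to out and return the number of
 * bytes written. Valid UTF-8 is copied as-is; any other byte is assumed
 * to be Latin-1 and re-encoded. Output never exceeds 2 * in_len bytes.
 * If out_cap is too small, the result is cut at a character boundary. */
size_t fix_utf8(const unsigned char *in, size_t in_len,
                unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0, k;

    while (i < in_len) {
        size_t len = utf8_seq_len(in + i, in_len - i);
        if (len > 0) {
            if (o + len > out_cap) break;
            for (k = 0; k < len; k++) out[o++] = in[i + k];
            i += len;
        } else {
            /* An invalid byte is always >= 0x80, so as a Latin-1
             * character it needs exactly two UTF-8 bytes. */
            if (o + 2 > out_cap) break;
            out[o++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[o++] = (unsigned char)(0x80 | (in[i] & 0x3F));
            i++;
        }
    }
    return o;
}
```

With this approach a Latin-1 `caf\xE9` becomes `caf\xC3\xA9`, while input that is already valid UTF-8 passes through untouched. One known weakness of the heuristic is that a Latin-1 byte pair which happens to form a valid UTF-8 sequence is left alone, and bytes in the 0x80-0x9F range map to C1 control characters rather than Windows-1252 punctuation; whether that matters depends on the upstream data.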