davedelong/CHCSVParser

Unable to auto-detect encoding

xanderdunn opened this issue · 1 comment

I get an "unable to determine stream encoding; assuming MacOSRoman" warning when trying to parse a CSV file using:

CHCSVParser *parser = [[CHCSVParser alloc]
                         initWithContentsOfCSVFile:csvFilePath];

As expected, all the international characters are scrambled in my app after this.

My CSV file was created using CHCSVWriter, then exported to my Mac, and then imported again into the app through iTunes file sharing. As far as I know, the file was never re-saved anywhere in that process.

It works correctly if I force UTF-8 encoding using:

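// inputStream is assumed here to be opened on the same file as above
NSInputStream *inputStream = [NSInputStream inputStreamWithFileAtPath:csvFilePath];
// Pass UTF-8 explicitly so the parser does not have to sniff the encoding
NSStringEncoding encoding = NSUTF8StringEncoding;
CHCSVParser *parser = [[CHCSVParser alloc]
                         initWithInputStream:inputStream
                         usedEncoding:&encoding
                         delimiter:','];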

As I suspected, sniffing the encoding wasn't accounting for multi-byte characters. Thanks for reporting this!
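
For readers wondering what "accounting for multi-byte characters" can look like, here is a minimal sketch in plain C (it also compiles inside an Objective-C file). It is not CHCSVParser's actual fix. The function name `sample_looks_like_utf8` is hypothetical, and so is the assumption that the sniffer inspects a fixed-size sample of the stream. The idea is that a sample may end in the middle of a multi-byte UTF-8 character, and that cut-off character should not by itself cause the sample to be rejected as UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper, not part of CHCSVParser.
 * Returns 1 if buf[0..len) is plausibly UTF-8, 0 otherwise.
 * A multi-byte sequence cut off by the end of the sample is tolerated,
 * because a fixed-size sniff buffer can end in the middle of a character.
 * Simplified: overlong and surrogate encodings are not rejected. */
static int sample_looks_like_utf8(const unsigned char *buf, size_t len) {
    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t extra; /* number of continuation bytes expected after b */

        if (b < 0x80) {
            extra = 0;                      /* ASCII */
        } else if (b >= 0xC2 && b <= 0xDF) {
            extra = 1;                      /* 2-byte sequence */
        } else if (b >= 0xE0 && b <= 0xEF) {
            extra = 2;                      /* 3-byte sequence */
        } else if (b >= 0xF0 && b <= 0xF4) {
            extra = 3;                      /* 4-byte sequence */
        } else {
            return 0;   /* stray continuation byte or invalid lead byte */
        }
        i++;

        for (size_t k = 0; k < extra; k++, i++) {
            if (i >= len) {
                return 1;   /* character truncated by end of sample: accept */
            }
            if ((buf[i] & 0xC0) != 0x80) {
                return 0;   /* expected a continuation byte (10xxxxxx) */
            }
        }
    }
    return 1;
}
```

With this check, the UTF-8 bytes for "café" (`63 61 66 C3 A9`) are accepted even when the sample stops after `C3`, while the MacRoman encoding of the same word (`63 61 66 8E`) is rejected because `8E` is not a valid UTF-8 lead byte.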