tilo/smarter_csv

Warning about UTF-8

hirowatari opened this issue · 3 comments

After upgrading from 1.9.2 to 1.10.0, I started seeing this warning:
WARNING: you are trying to process UTF-8 input, but did not open the input with "b:utf-8" option. See README file "NOTES about File Encodings".

Looking into it further, the warning seems to come from line 19 of lib/smarter_csv/smarter_csv.rb:

@enforce_utf8 = options[:force_utf8] || options[:file_encoding] !~ /utf-8/i

Perhaps this should be:

@enforce_utf8 = options[:force_utf8] || options[:file_encoding] =~ /utf-8/i

(though perhaps I don't understand what's going on in that function well enough).
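To make the difference between the two operators concrete, here is a minimal sketch that evaluates both expressions quoted above against the same options. The helper names `enforce_utf8_current` and `enforce_utf8_proposed` are hypothetical and exist only for this illustration; the sketch covers just the quoted line, not the rest of the library's option handling, so it does not by itself establish what the correct fix is.

```ruby
# Hypothetical helpers that wrap the two expressions from the report.
# They are not part of smarter_csv.
def enforce_utf8_current(options)
  options[:force_utf8] || options[:file_encoding] !~ /utf-8/i
end

def enforce_utf8_proposed(options)
  options[:force_utf8] || options[:file_encoding] =~ /utf-8/i
end

iso = { file_encoding: 'ISO-8859-1' }
utf = { file_encoding: 'utf-8' }

# `!~` is true when the string does NOT match the pattern.
current_iso = enforce_utf8_current(iso) # true, because 'ISO-8859-1' does not match /utf-8/i
current_utf = enforce_utf8_current(utf) # false

# `=~` returns the match index (truthy) or nil (falsy).
proposed_iso = enforce_utf8_proposed(iso) # nil
proposed_utf = enforce_utf8_proposed(utf) # 0

puts "current:  iso=#{current_iso.inspect} utf=#{current_utf.inspect}"
puts "proposed: iso=#{proposed_iso.inspect} utf=#{proposed_utf.inspect}"
```

With the `!~` form, the flag is set exactly when the declared encoding is something other than UTF-8, which is consistent with the warning appearing for an ISO-8859-1 file.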

Here is the relevant file: location-iso-8859-1.csv
This is the code that triggers the warning:

file_encoding = 'ISO-8859-1'
smarter_csv_options = { chunk_size: CHUNK_SIZE,
                        remove_empty_values: false,
                        convert_values_to_numeric: false,
                        duplicate_header_suffix: nil, # raise error if there are duplicate headers
                        file_encoding:,
                        value_converters: {
                          phone: CsvConverters::BlankToNil,
                          name: CsvConverters::ReplaceSpecialCharacters,
                          address: CsvConverters::ReplaceSpecialCharacters,
                          city: CsvConverters::ReplaceSpecialCharacters,
                          state: CsvConverters::ReplaceSpecialCharacters,
                          zip: CsvConverters::ReplaceSpecialCharacters,
                          country: CsvConverters::ReplaceSpecialCharacters,
                          website: CsvConverters::ReplaceSpecialCharacters
                        } }
SmarterCSV.process(File.open(@file, mode: 'r', encoding: file_encoding), smarter_csv_options) do |chunk|
  # ...
end
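As a quick sanity check on the input side, the following sketch uses only Ruby's standard library to confirm what encoding a handle opened this way reports. The sample data and temporary file are made up for the illustration and stand in for the attached CSV. The second `File.open` uses Ruby's `external:internal` encoding form to transcode to UTF-8 while reading; whether smarter_csv expects that form is an assumption not confirmed in this thread.

```ruby
require 'tempfile'

# Write a small ISO-8859-1 encoded sample (hypothetical data, not the attached file).
sample = Tempfile.new(['location', '.csv'])
sample.binmode
sample.write("name,city\nCafé,Montréal\n".encode('ISO-8859-1'))
sample.close

# Same open call shape as in the report: the handle reports ISO-8859-1.
external = File.open(sample.path, mode: 'r', encoding: 'ISO-8859-1') do |f|
  f.external_encoding
end

# "external:internal" form: read ISO-8859-1 bytes, hand back UTF-8 strings.
row_encoding = File.open(sample.path, mode: 'r', encoding: 'ISO-8859-1:UTF-8') do |f|
  f.gets          # skip the header line
  f.gets.encoding # encoding of the first data row
end

puts external.name
puts row_encoding.name

sample.unlink
```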
tilo commented

@hirowatari thank you for reporting!

I will fix this in 1.10.1 today or tomorrow.

@tilo Thanks. I especially appreciate the speedy release.