larsyencken/csvdiff

Comparing CSV files that have a BOM and passing the first column to ignore_columns raises an error at sequence[key].pop(i)

Closed this issue · 2 comments

error message:
Traceback (most recent call last):
  File "C:\Python27\Scripts\csvdiff-script.py", line 9, in <module>
    load_entry_point('csvdiff==0.3.1', 'console_scripts', 'csvdiff')()
  File "C:\Python27\lib\site-packages\click\core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "C:\Python27\lib\site-packages\click\core.py", line 696, in main
    rv = self.invoke(ctx)
  File "C:\Python27\lib\site-packages\click\core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Python27\lib\site-packages\click\core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "C:\Python27\lib\site-packages\csvdiff-0.3.1-py2.7.egg\csvdiff\__init__.py", line 151, in csvdiff_cmd
    sep=sep, ignored_columns=ignore_columns)
  File "C:\Python27\lib\site-packages\csvdiff-0.3.1-py2.7.egg\csvdiff\__init__.py", line 181, in _diff_and_summarize
    diff = patch.create(from_records, to_records, index_columns, ignored_columns)
  File "C:\Python27\lib\site-packages\csvdiff-0.3.1-py2.7.egg\csvdiff\patch.py", line 208, in create
    from_indexed = records.filter_ignored(from_indexed, ignore_columns)
  File "C:\Python27\lib\site-packages\csvdiff-0.3.1-py2.7.egg\csvdiff\records.py", line 52, in filter_ignored
    sequence[key].pop(i)
KeyError: u'aa'

The error is raised by sequence[key].pop(i). Comparing CSV files with a BOM needs some extra handling there, for example:
sequence[key].pop(i.encode('utf-8-sig'))
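For anyone hitting the same KeyError, here is a small standalone sketch of the likely cause. It is not csvdiff's code, and it is written for Python 3 rather than the Python 2.7 shown in the traceback. The sample bytes and the `header` helper are made up for illustration. When a BOM-prefixed file is decoded as plain UTF-8, the BOM stays attached to the first column name, so looking up 'aa' fails.

```python
import csv
import io

# Hypothetical sample: a UTF-8 CSV that starts with a byte order mark.
raw = b"\xef\xbb\xbfaa,bb\r\n1,2\r\n"


def header(data, encoding):
    """Decode data with the given encoding and return the CSV header row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))


plain = header(raw, "utf-8")      # BOM is kept as part of the first name
clean = header(raw, "utf-8-sig")  # BOM is stripped while decoding

print(repr(plain[0]))
print(repr(clean[0]))
```

With plain UTF-8 decoding the first header is 'aa' preceded by U+FEFF, which is why a column named 'aa' is not found.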

I'm using Python's native csv module and native UTF-8 handling. Whilst we could allow CSV files with this kind of encoding, can I recommend instead that you just convert them to normal UTF-8 without BOM before doing a diff?
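A minimal sketch of that conversion, assuming the input really is UTF-8 with an optional BOM. The function names `strip_utf8_bom` and `has_utf8_bom` are hypothetical helpers, not part of csvdiff. It relies on Python's standard 'utf-8-sig' codec, which drops a leading BOM when reading and otherwise behaves like 'utf-8'.

```python
import codecs
import io


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    with io.open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def strip_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path as UTF-8 without a leading BOM."""
    # newline="" keeps the original line endings untouched in both directions.
    with io.open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    with io.open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

Running `strip_utf8_bom("old.csv", "old_nobom.csv")` on each input (the file names here are placeholders) and then diffing the converted copies should avoid the BOM ending up in the first column name.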

Converting is a good idea, thanks for the reply!