digital-preservation/csv-validator

Encoding

eilandnl opened this issue · 2 comments

Would it be possible to add support for specifying the encoding, perhaps as a flag such as --encoding utf-8-bom?

At the moment I am unable to validate UTF-8 files that start with a BOM.

We use the Univocity CSV Parser internally. It does look like we could add a flag as you suggest, which would make the validator read its input through Univocity's BomInput instead: http://docs.univocity.com/parsers/2.7.3/index.html?com/univocity/parsers/common/input/BomInput.html

As a workaround, there are lots of tools available that you can use to strip any BOM from the files before processing them. Just do a Google search for "strip BOM".
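If you would rather not install a separate tool, the workaround can be sketched in a few lines of Python using only the standard library. This is a minimal sketch and not part of csv-validator: the function name `strip_utf8_bom` and the file paths are made up for illustration, it only handles the UTF-8 BOM (bytes EF BB BF), and it reads the whole file into memory, so it assumes the CSV is reasonably small.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(src: Path, dst: Path) -> bool:
    """Copy src to dst without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False otherwise.
    Hypothetical helper for illustration; not part of csv-validator.
    """
    data = src.read_bytes()
    # codecs.BOM_UTF8 is the three-byte sequence b"\xef\xbb\xbf"
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        data = data[len(codecs.BOM_UTF8):]
    dst.write_bytes(data)
    return had_bom
```

You would call it as `strip_utf8_bom(Path("input.csv"), Path("input-nobom.csv"))` and then point the validator at the second file. Working on raw bytes rather than decoded text means the rest of the file is copied unchanged.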