kevinseim/beanio

Enable XML mapping file to identify and ignore Byte Order Mark (BOM)

I use the 2.1.0 release of beanio to extract data from .csv files.

If the file was created on Windows, its first three bytes are often a “Byte Order Mark” (in my case ef bb bf, the UTF-8 BOM), which crashes the beanio reader.

In order to process Windows files I have to first run them through some external program to remove the BOM.

I’ve searched the documentation and, unless I’ve missed something, this release has no option to detect and skip the BOM.
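For anyone hitting the same problem, the BOM could probably be stripped in Java before the data reaches beanio, instead of using an external program. Below is a minimal sketch of that idea, assuming the files are UTF-8. The class name `BomSkippingReader` and its `open` method are hypothetical helpers of mine, not part of beanio.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of beanio): drops a leading UTF-8 BOM, if any.
final class BomSkippingReader {

    private BomSkippingReader() {}

    /** Wraps the stream in a UTF-8 Reader, skipping a leading EF BB BF if present. */
    static Reader open(InputStream in) throws IOException {
        // Pushback buffer of 3 bytes so the peeked bytes can be returned to the stream.
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until 3 bytes or EOF.
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            // Not a BOM: put the bytes back so the caller sees the file unchanged.
            pushback.unread(head, 0, n);
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }
}
```

The returned Reader could then be passed to beanio in place of a plain file reader, for example `factory.createReader("inputFile", BomSkippingReader.open(new FileInputStream(path)))`, where `factory` is a `StreamFactory` that has already loaded the mapping file and `path` is a placeholder. This only handles the UTF-8 BOM; UTF-16 or UTF-32 files would need their own checks.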

XML snippet below:

<stream name="inputFile" format="csv">
   <parser>
      <property name="delimiter" value="," />
      <!--property name="quote" value="\" /-->
      <property name="comments" value="HeaderLineBegin" />
   </parser>
    …
</stream>