
Introduce BOM for Microsoft applications

theexiile1305 opened this issue · 10 comments

Hey there,

thank you very much for this great project.

Microsoft applications, for some reason, seem to require a BOM to parse UTF-8 files correctly, even though UTF-8, unlike UTF-16 and UTF-32, has no byte order. So that a generated CSV file opens correctly, I suggest adding the UTF-8 BOM (the three bytes 0xEF, 0xBB and 0xBF at the start of the file), even when the csvWriter is configured with Charsets.UTF_8.name().
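Until such an option exists, the BOM can be written by hand before the CSV content. The following is only a minimal sketch using the Kotlin standard library; `writeCsvWithBom` and `utf8Bom` are hypothetical names for illustration and are not part of kotlin-csv.

```kotlin
import java.io.File

// The three bytes of the UTF-8 byte order mark (the encoded form of U+FEFF).
val utf8Bom = byteArrayOf(0xEF.toByte(), 0xBB.toByte(), 0xBF.toByte())

// Hypothetical helper (not a kotlin-csv API): writes the BOM first,
// then the CSV text encoded as UTF-8.
fun writeCsvWithBom(file: File, csvText: String) {
    file.outputStream().use { out ->
        out.write(utf8Bom)
        out.write(csvText.toByteArray(Charsets.UTF_8))
    }
}

fun main() {
    val file = File("test.csv")
    writeCsvWithBom(file, "name\nM\u00FCller\n")

    // Print the first three bytes of the file as hex to confirm the BOM is there.
    val head = file.readBytes().take(3)
    println(head.joinToString(" ") { "%02X".format(it.toInt() and 0xFF) }) // prints "EF BB BF"
}
```

A file written this way still decodes as ordinary UTF-8; readers that do not strip the BOM will see U+FEFF as the first character of the first field.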

I don't know why this is undocumented or why Excel seems to require a BOM for UTF-8; those might be good questions for the Excel team at Microsoft.

What do you think? Do you have any suggestions for solving this problem?

Thank you for the question. Can you elaborate on this?
Is your problem something like the following?
"CSV files written by kotlin-csv don't have a BOM, so it cannot be read by Excel."

@doyaaaaaken Thank you for your quick response. Yes, of course, I can elaborate with the following example.
The CSV file is created successfully with the UTF-8 setting enabled.


If I open this file in Google Sheets or Numbers (the macOS spreadsheet application), Müller is displayed correctly. In contrast, Müller is shown as M√ºller in Excel. Further analysis showed that none of the non-ASCII UTF-8 characters (e.g. öäüÄÖÜß - the special German characters) are displayed correctly in Excel.
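The garbled name is what you would expect if the UTF-8 bytes are decoded with a single-byte legacy charset, which appears to be what Excel does when there is no BOM. A minimal sketch of that effect follows; it uses ISO-8859-1 only because that charset is guaranteed on every JVM, so the garbage differs from the M√ºller seen above (which matches a Mac Roman reading of the same bytes). `misreadAsLatin1` is a hypothetical helper name.

```kotlin
// Hypothetical helper: encodes text as UTF-8, then decodes the bytes
// as ISO-8859-1, mimicking a reader that guesses the wrong charset.
fun misreadAsLatin1(text: String): String =
    String(text.toByteArray(Charsets.UTF_8), Charsets.ISO_8859_1)

fun main() {
    val name = "M\u00FCller" // "Müller", written with an escape to avoid source-encoding issues

    // In UTF-8, the single character ü becomes the two bytes C3 BC.
    val utf8Bytes = name.toByteArray(Charsets.UTF_8)
    println(utf8Bytes.joinToString(" ") { "%02X".format(it.toInt() and 0xFF) })
    // prints "4D C3 BC 6C 6C 65 72"

    // Each of those two bytes is shown as its own character when misread.
    println(misreadAsLatin1(name)) // prints "MÃ¼ller"
}
```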

The situation you described has been successfully reproduced by this code, thanks.

        csvWriter().open("test.csv") {

So, I plan to introduce an includeBOM: Boolean option on CsvWriterContext.
You would use the option as in the snippet below.
Do you think this is OK?

    csvWriter {
        includeBOM = true
    }.open("test.csv") {
        // do some operation
    }

Sorry for the late response. The above snippet looks great and it's okay for me. Thank you!

If you want, I can give this issue a try. 😄

@theexiile1305 Thanks! Please try it.

@theexiile1305: As a workaround, you can also import the CSV file via Data | From Text/CSV instead of just opening it. This has the advantage that you can explicitly select the source file encoding in the import dialog.


hey @doyaaaaaken, has this been resolved?

Hi @EthanDunfordAspect, this has not been resolved yet.

released in v1.9.0 🚀