Add QuoteStrategy parameter in CsvReader to handle empty strings vs null values
OlivierLevrey opened this issue · 2 comments
QuoteStrategy.EMPTY is convenient when I want to differentiate empty strings from null values in the output file.
However, CsvReader has no such parameter, which means I cannot read back the original data.
Below is a unit test showing this:
    /**
     * Writes a single row of special values, reads back the file, and tests
     * that read values exactly match the original values.
     */
    @Test
    public void test() throws IOException {
        String[] values = new String[]{
            "Simple text",
            "Multiline\ntext",
            // a string containing a comma
            "1,2",
            // a string with double quotes
            "\"Hello\"",
            // a string containing a single character: a double quote
            "\"",
            // an empty string
            "",
            // a null value
            null
        };
        File tmp = new File("C:/tmp/csv.txt");
        // write the csv file
        try (CsvWriter csv = CsvWriter.builder()
                .quoteStrategy(QuoteStrategy.EMPTY)
                .build(tmp.toPath(), StandardCharsets.UTF_8)) {
            csv.writeRow(values);
        }
        // read back the file
        String[] readValues = null;
        try (CsvReader csv = CsvReader.builder()
                .skipEmptyRows(true)
                .build(tmp.toPath(), StandardCharsets.UTF_8)) {
            for (CsvRow row : csv) {
                readValues = new String[row.getFieldCount()];
                for (int i = 0; i < readValues.length; i++) {
                    readValues[i] = row.getField(i);
                }
            }
        }
        Assert.assertNotNull(readValues);
        // this fails because of the null value read back as an empty string
        Assert.assertArrayEquals(values, readValues);
    }
}
It would be very nice to have the QuoteStrategy parameter in the reader.
Thanks for your feedback!
The difference between reading and writing is:
- Java has support for differentiating between null and empty string
- CSV (per RFC) does not allow differentiation – it's simply a blank field and the developer can decide how to handle it
When creating the API of FastCSV, I decided to design a null-free API (see features). That way I can ensure that no one gets a NullPointerException when working with this library's API.
Because of this design decision, I don't plan to offer a mechanism to read or return null values, not even with an optional strategy. If you really want nulls in your code, I'm afraid you have to add or convert them in your application code. I hope you can understand.
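For anyone landing here, one way to do that conversion in application code is a sentinel token: replace null with a marker before writing and map the marker back to null after reading. The sketch below is only an illustration and not part of FastCSV; the class name NullMarker, the token "\N", and both helper methods are hypothetical, and it assumes the token never occurs in real data.

```java
// Hypothetical helper, not part of FastCSV: round-trips null through a sentinel token.
final class NullMarker {

    // Assumed sentinel (backslash followed by N); choose a token that cannot appear in real data.
    static final String NULL_TOKEN = "\\N";

    private NullMarker() {
    }

    // Call before writing: replaces each null with the sentinel, leaves other values untouched.
    static String[] encode(String[] values) {
        String[] out = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] == null ? NULL_TOKEN : values[i];
        }
        return out;
    }

    // Call after reading: maps the sentinel back to null, so empty strings stay empty strings.
    static String[] decode(String[] fields) {
        String[] out = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            out[i] = NULL_TOKEN.equals(fields[i]) ? null : fields[i];
        }
        return out;
    }
}
```

In the unit test above, this would mean calling csv.writeRow(NullMarker.encode(values)) and comparing values against NullMarker.decode(readValues). With a sentinel in place, QuoteStrategy.EMPTY is no longer needed to tell the two cases apart.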
OK, thank you for your quick reply.