CsvReader fails to parse file with some quoted fields
oflege opened this issue · 3 comments
Describe the bug
CsvReader with errorOnDifferentFieldCount(true) fails for a particular CSV file. If I change anything in that file before the problematic line 228 (delete a row, add or remove a character from a field in a row), or replace the double quotes in line 228 with single quotes, the CsvReader does not fail.
To Reproduce
JUnit test to reproduce the behavior:
try (CsvReader r = CsvReader.builder().fieldSeparator(';').errorOnDifferentFieldCount(true)
        .build(new File("a.csv").toPath(), StandardCharsets.ISO_8859_1)
) {
    r.iterator().forEachRemaining(System.out::println);
}
Thanks for reporting this issue! Given your test code and data I could successfully reproduce and fix it.
The problem was caused by the combination of two things:
- Quote character within an unquoted field (nonconforming data per section 2, rule 5 of RFC 4180)
- Need to refill the input buffer while parsing such a field
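To locate rows with the first condition in your own data, a line-based check along these lines might help. This is only a sketch: `UnquotedQuoteCheck` and `fieldsWithStrayQuote` are hypothetical names, not part of FastCSV, and the check assumes one record per line, so quoted fields that span several lines are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of FastCSV): flags fields that contain a
// double quote without being enclosed in double quotes.
final class UnquotedQuoteCheck {

    /** Returns the zero-based indexes of unquoted fields that contain a double quote. */
    static List<Integer> fieldsWithStrayQuote(String line, char separator) {
        List<Integer> offending = new ArrayList<>();
        int n = line.length();
        int i = 0;
        int fieldIndex = 0;
        while (i <= n) {
            if (i < n && line.charAt(i) == '"') {
                // Quoted field: skip to the closing quote; "" is an escaped quote.
                i++;
                while (i < n) {
                    if (line.charAt(i) == '"') {
                        if (i + 1 < n && line.charAt(i + 1) == '"') {
                            i += 2;
                            continue;
                        }
                        i++; // consume the closing quote
                        break;
                    }
                    i++;
                }
                // Ignore anything between the closing quote and the separator.
                while (i < n && line.charAt(i) != separator) {
                    i++;
                }
            } else {
                // Unquoted field: any double quote in here is nonconforming.
                boolean stray = false;
                while (i < n && line.charAt(i) != separator) {
                    if (line.charAt(i) == '"') {
                        stray = true;
                    }
                    i++;
                }
                if (stray) {
                    offending.add(fieldIndex);
                }
            }
            i++; // step over the separator, or past the end of the line
            fieldIndex++;
        }
        return offending;
    }
}
```

Running this over each line of a file and printing the line numbers with a non-empty result should point at rows like line 228 in the report. It says nothing about the second condition, since where the reader refills its buffer depends on internals of the library.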
Could you try the develop branch to see whether it fixes your problem with real data?
Thanks a lot for the quick fix. I just tested the code in the develop branch with our current set of CSV files, and all were parsed successfully.
Thanks! Fixed in 2.2.1, which was just released.