Bug: duplicate error lines after upgrading to v3.0
BrendonPierson opened this issue · 2 comments
I had a unit test checking the proper storing of errors that broke after upgrading from 2.5 to 3.0. The test tries to parse a file with known stray quote errors on lines 2 and 4:
Jerry,31,comedian
Elaine,28,"unknwon ""
Kramer,40,kramerica
George,34,architect "vandalay" ind
Newman,37,"postal, worker"
Running the following, I expect two errors (rows 2 and 4) and three good rows (rows 1, 3, and 5).
File.stream!("three_good_two_bad.csv") |> CSV.decode |> Enum.map(& &1)
Instead I get 3 errors:
[
ok: ["Jerry", "31", "comedian"],
error: "Stray escape character on line 4:\n\n\"\nKramer,40,kramerica\nGeorge,34,architect \"vandalay\" ind\n\nThis error often happens when the wrong separator or escape character has been applied.\n",
ok: ["Kramer", "40", "kramerica"],
error: "Stray escape character on line 4:\n\narchitect \"vandalay\" ind\n\nThis error often happens when the wrong separator or escape character has been applied.\n",
error: "Stray escape character on line 5:\n\narchitect \"vandalay\" ind\n\nThis error often happens when the wrong separator or escape character has been applied.\n",
ok: ["Newman", "37", "postal, worker"]
]
I wouldn't expect multiple error tuples for a single row. Also, the error messages report lines 4, 4, and 5, when I would expect the errors to be reported on lines 2 and 4.
Thanks for the help in advance!
Thanks for raising this. Reparsing included the last line in this scenario. It is fixed in 3.0.3.
As an aside, the first error is reported on line 4 because line 2 starts a valid escape sequence and also correctly escapes the quote at the end by doubling it. The stray escape character occurs at "vandalay in the sequence.
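To illustrate why line 2 leaves the quoted field open, here is a minimal sketch of a quote-state scan. `QuoteScan` is a hypothetical helper written for this thread, not part of the CSV library, and it is not how the library parses internally. It only tracks whether the text ends inside a quoted field. It does not detect stray quotes in the middle of an unquoted field.

```elixir
defmodule QuoteScan do
  # Hypothetical helper for illustration only; not part of the CSV library.

  # Returns :open if the text ends inside a quoted field, :closed otherwise.
  def state(text), do: scan(text, :closed)

  # End of input: report the current state.
  defp scan(<<>>, state), do: state
  # Outside quotes: a quote opens an escaped field.
  defp scan(<<?", rest::binary>>, :closed), do: scan(rest, :open)
  # Inside quotes: a doubled quote is a literal quote, so the field stays open.
  defp scan(<<?", ?", rest::binary>>, :open), do: scan(rest, :open)
  # Inside quotes: a single quote closes the field.
  defp scan(<<?", rest::binary>>, :open), do: scan(rest, :closed)
  # Any other byte leaves the state unchanged.
  defp scan(<<_byte, rest::binary>>, state), do: scan(rest, state)
end
```

With this sketch, `QuoteScan.state(~s(Elaine,28,"unknwon ""))` returns `:open`, because the trailing `""` is read as an escaped quote rather than a closing one. Appending line 3 (`Kramer,40,kramerica`) still gives `:open`, which is consistent with the parser only noticing a problem once it reaches the quotes on line 4. A well-formed line such as `QuoteScan.state(~s(Newman,37,"postal, worker"))` returns `:closed`.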
Ah that makes sense on the line number. Thanks for tackling this so fast!