digital-preservation/csv-validator

Very slow response when the validator finds a large number of errors

Closed this issue · 5 comments

A CSV file of 3,605 rows and 114 columns, with complex regex rules for most of the columns, is parsed very quickly when there are no errors (through the Java API). Performance degrades steeply if I introduce a simple error into every row of the file (for example, changing one letter of a three-letter country code in one column). This evidently has nothing to do with parsing the rows and columns but with writing the FailMessages. It would be great if this could be alleviated.
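
For reference, here is a minimal sketch of the kind of call I mean. The `CsvValidatorJavaBridge` entry point and the exact `validate(...)` signature are assumptions based on the library's Java bridge, and the file names are hypothetical; check the Javadoc of your csv-validator release for the actual parameters:

```java
import java.util.Collections;
import java.util.List;

import uk.gov.nationalarchives.csv.validator.api.java.CsvValidatorJavaBridge;
import uk.gov.nationalarchives.csv.validator.api.java.FailMessage;
import uk.gov.nationalarchives.csv.validator.api.java.Substitution;

public class ValidateAll {
    public static void main(String[] args) {
        // failFast = false collects a FailMessage for every failing row,
        // which is the case where the slowdown shows up (~3,600 messages).
        List<FailMessage> failures = CsvValidatorJavaBridge.validate(
                "data.csv",                            // hypothetical CSV file
                "data.csvs",                           // hypothetical schema file
                false,                                 // failFast
                Collections.<Substitution>emptyList(), // no path substitutions
                false);                                // enforceCaseSensitivePathChecks

        for (FailMessage failure : failures) {
            System.out.println(failure.getMessage());
        }
    }
}
```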

Could you try the new 1.3.0 release and see if you still have the issue: https://github.com/digital-preservation/csv-validator/releases/tag/1.3.0

I guess you know about the 'Fail on First Error' setting, which we tend to use to get a quick response when errors are encountered.
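
Through the Java API, that setting should correspond to the failFast argument. A hedged sketch, under the same assumed `CsvValidatorJavaBridge` signature as above:

```java
import java.util.Collections;
import java.util.List;

import uk.gov.nationalarchives.csv.validator.api.java.CsvValidatorJavaBridge;
import uk.gov.nationalarchives.csv.validator.api.java.FailMessage;
import uk.gov.nationalarchives.csv.validator.api.java.Substitution;

public class ValidateFailFast {
    public static void main(String[] args) {
        // failFast = true stops at the first FailMessage instead of
        // accumulating one per row, so a bad file is reported quickly.
        List<FailMessage> failures = CsvValidatorJavaBridge.validate(
                "data.csv", "data.csvs",               // hypothetical file names
                true,                                  // failFast
                Collections.<Substitution>emptyList(), // no path substitutions
                false);                                // enforceCaseSensitivePathChecks

        System.out.println(failures.isEmpty()
                ? "valid"
                : "first error: " + failures.get(0).getMessage());
    }
}
```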

Yes, v1.3.0 has speed improvements when writing errors to the application window.

I tested v1.3.0. The speed of error report generation has greatly improved.