epam/parso

Expose CSVDataWriterImpl.processEntry for other output uses

Gagravarr opened this issue · 4 comments

Once #19 is done, we'd like to use Parso to provide a SAS7BDAT parser for Apache Tika (see TIKA-2462). In that parser, we'll want to get the "formatted" value of each cell as a string, then output those values as SAX events for an HTML table.

Currently, it seems that all the logic for turning column metadata + column number + raw value into a formatted string is hidden inside CSVDataWriterImpl, mainly in CSVDataWriterImpl.processEntry.

It would be great if the logic for formatting as a string could be made available for re-use! Maybe by pulling it out to a helper class that CSVDataWriterImpl then uses?
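
To make the suggestion concrete, here is a minimal sketch of the kind of split being asked for. Every name in it (CellFormatter, formatRow, formatValue, SimpleCsvWriter) is hypothetical and is not Parso's actual API. It also ignores column metadata and picks a format from the Java type of the raw value, which is a simplification of what processEntry would need to do.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: shows the proposed split only, not Parso's real code.
final class CellFormatter {
    private CellFormatter() { }

    /** Turns one raw row into display strings, one string per column. */
    static List<String> formatRow(Object[] row) {
        List<String> out = new ArrayList<>(row.length);
        for (Object value : row) {
            out.add(formatValue(value));
        }
        return out;
    }

    /** Formats a single raw value; the format choices here are assumptions. */
    static String formatValue(Object value) {
        if (value == null) {
            return "";
        }
        if (value instanceof Date) {
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
            fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
            return fmt.format((Date) value);
        }
        if (value instanceof Double) {
            DecimalFormat fmt =
                    new DecimalFormat("0.##", DecimalFormatSymbols.getInstance(Locale.ROOT));
            return fmt.format((Double) value);
        }
        return value.toString();
    }
}

// Hypothetical CSV writer that delegates all formatting to the helper,
// so other outputs (for example an HTML table) can reuse the same strings.
final class SimpleCsvWriter {
    private SimpleCsvWriter() { }

    static String toCsvLine(Object[] row) {
        return String.join(",", CellFormatter.formatRow(row));
    }
}
```

With this shape, the CSV writer only joins strings, and any other consumer can call the helper directly without going through CSV output.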

Hi @Gagravarr, could you please check the master branch? Is this what you expected?

Looks like DataWriterUtil.getRowValues should give us what we need for Tika, thanks!
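
For illustration, a rough sketch of the consuming side: turning already-formatted cell strings into SAX events for an HTML table. It uses only the JDK's org.xml.sax classes. HtmlTableEmitter is a hypothetical name, and taking each row as a List<String> is an assumption about how the formatted values arrive, not a statement of Parso's or Tika's API.

```java
import java.util.List;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical emitter: writes table/tr/td SAX events to any ContentHandler.
final class HtmlTableEmitter {
    private static final String XHTML = "http://www.w3.org/1999/xhtml";
    private final ContentHandler handler;

    HtmlTableEmitter(ContentHandler handler) {
        this.handler = handler;
    }

    void startTable() throws SAXException {
        start("table");
    }

    void endTable() throws SAXException {
        end("table");
    }

    /** Emits one table row with one td element per formatted cell string. */
    void row(List<String> cells) throws SAXException {
        start("tr");
        for (String cell : cells) {
            start("td");
            char[] chars = cell.toCharArray();
            handler.characters(chars, 0, chars.length);
            end("td");
        }
        end("tr");
    }

    private void start(String name) throws SAXException {
        // No attributes are set; an empty AttributesImpl stands in for "none".
        handler.startElement(XHTML, name, name, new AttributesImpl());
    }

    private void end(String name) throws SAXException {
        handler.endElement(XHTML, name, name);
    }
}
```

The caller would invoke startTable() once, row(...) for each row of formatted values, and endTable() at the end.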

Once there's a 2.0.9 we can go ahead and add the parser :)

2.0.9 is available in Maven Central, so I'm closing the issue now. Please reopen if you need any assistance.

I've now integrated this with Apache Tika, and everything worked great, thanks!

It'll be included in Tika 1.19 (or 2.0 if we get that out first...)