wrong CSV separators, starting from HTML table that has commas inside cells
aborruso opened this issue · 5 comments
aborruso commented
Hi,
if I run tabulator input.html
on the HTML table below, I get
RNDFNC60E16,RIPACANDIDA,85020,POTENZA,250,00
RNDFNC60E16,,,POTENZA,250,00
instead of
RNDFNC60E16,RIPACANDIDA,85020,POTENZA,"250,00"
RNDFNC60E16,,,POTENZA,"250,00"
Thank you
<!DOCTYPE html>
<html>
<body>
<table id="results" border="0" class="regpub_dati c35">
<tbody>
<tr class="c28">
<th class="c27">Beneficiario</th>
<th class="c27">Comune</th>
<th class="c27">CAP</th>
<th class="c27">Provincia </th>
<th class="c27">Importo</th>
</tr>
<tr>
<td class="c31">RNDFNC60E16</td>
<td class="c31">RIPACANDIDA</td>
<td class="c31">85020</td>
<td class="c31">POTENZA</td>
<td class="c34">250,00</td>
</tr>
<tr>
<td class="c31">RNDFNC60E16</td>
<td class="c31"></td>
<td class="c31"></td>
<td class="c31">POTENZA</td>
<td class="c34">250,00</td>
</tr>
</tbody>
</table>
</body>
</html>
roll commented
Hi @aborruso,
That's only because the CLI just prints the rows to the console. Saving to a file goes through the CSV writer:
from tabulator import Stream
with Stream('tmp/issue324.html') as stream:
    stream.save('tmp/issue324.csv')
This will give you properly quoted output:
RNDFNC60E16,RIPACANDIDA,85020,POTENZA,"250,00"
RNDFNC60E16,,,POTENZA,"250,00"
roll commented
It's not supported yet.
Would you like to create a feature request?
roll commented
It's kind of mixed: the console output uses bold
for headers and just a simple comma-joined line for each row, with no CSV quoting
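To illustrate the difference described above, here is a minimal sketch using only Python's standard library. It does not use tabulator's own code: the `join_plain` and `to_csv` helpers are hypothetical names, and `join_plain` only assumes that the console output behaves like a plain comma join. The rows are the ones from the HTML table in this issue.

```python
import csv
import io

# Rows as they appear in the HTML table from this issue.
rows = [
    ["RNDFNC60E16", "RIPACANDIDA", "85020", "POTENZA", "250,00"],
    ["RNDFNC60E16", "", "", "POTENZA", "250,00"],
]


def join_plain(row):
    """Join cells with commas and no quoting (assumed console-style output)."""
    return ",".join(row)


def to_csv(rows):
    """Serialize rows with csv.writer, which quotes cells that contain commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


plain_lines = [join_plain(row) for row in rows]
csv_text = to_csv(rows)

print("\n".join(plain_lines))  # the ambiguous form reported in the issue
print(csv_text, end="")        # the quoted form a CSV writer produces
```

With a plain join, the comma inside 250,00 cannot be told apart from a separator, so each row looks like it has six fields. A CSV writer quotes that cell, which is why the saved file comes out as expected.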