Convert HTML Table to JSON using BeautifulSoup

html-table-to-json

Convert an HTML table (with no rowspan/colspan) to JSON using Python.

Input

ID  Vendor    Product
1   Intel     Processor
2   AMD       GPU
3   Gigabyte  Mainboard

Output

[
    {
        "product": "Processor", 
        "vendor": "Intel", 
        "id": "1"
    }, 
    {
        "product": "GPU", 
        "vendor": "AMD", 
        "id": "2"
    }, 
    {
        "product": "Mainboard", 
        "vendor": "Gigabyte", 
        "id": "3"
    }
]

Input (Without Table Header)

1   Intel     Processor
2   AMD       GPU
3   Gigabyte  Mainboard

Output

[
  [
    "1",
    "Intel",
    "Processor"
  ],
  [
    "2",
    "AMD",
    "GPU"
  ],
  [
    "3",
    "Gigabyte",
    "Mainboard"
  ]
]
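
The two cases above (keyed objects when header cells are present, plain lists when they are not) can be sketched as follows. This is an illustration of the mapping only, not the project's code: the project uses BeautifulSoup, while this sketch swaps in the standard library's html.parser so it runs without dependencies. The names `TableRows` and `table_to_json` are hypothetical, and the sketch assumes a single flat table with no rowspan/colspan. Key order in the printed JSON may differ from the sample output above.

```python
import json
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect cell text from a flat table (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.headers = []   # lower-cased text of every <th> cell
        self.rows = []      # one list of <td> texts per data row
        self._cell = None   # tag name of the cell being read, or None
        self._text = []     # text fragments of the cell being read
        self._row = []      # <td> texts of the row being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td"):
            self._cell, self._text = tag, []

    def handle_data(self, data):
        # Only text inside a cell is kept; fragments are stripped.
        if self._cell is not None:
            self._text.append(data.strip())

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell == tag:
            text = "".join(self._text)  # fragments joined with no separator
            if tag == "th":
                self.headers.append(text.lower())
            else:
                self._row.append(text)
            self._cell = None
        elif tag == "tr" and self._row:
            # Rows that contained only <th> cells leave _row empty.
            self.rows.append(self._row)
            self._row = []


def table_to_json(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if parser.headers:
        # Header found: one object per row, keyed by header text.
        data = [dict(zip(parser.headers, row)) for row in parser.rows]
    else:
        # No header: one list per row.
        data = parser.rows
    return json.dumps(data, indent=4)


if __name__ == "__main__":
    print(table_to_json(
        "<table><tr><th>ID</th><th>Vendor</th></tr>"
        "<tr><td>1</td><td>Intel</td></tr></table>"
    ))
```

Lower-casing the header text mirrors the lower-case keys in the sample output; whether the project normalizes keys in exactly this way is an assumption.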

Input (Multi-line and Empty Cells)

Date         Transaction Description  Debit/Cheque  Credit/Deposit  Balance
13 Jul 2022  SOME WORKPLACE                         $3,509.30       OD $1,725.53
             Salary
12 Jul 2022  ATM DEPOSIT                            $400.00         OD $5,234.83
             CARD 1605
11 Jul 2022  Another Transaction                    $104.00         OD $5,634.83
             Another Transaction
11 Jul 2022  MB TRANSFER              $4.50                         OD $5,738.83
             TO XX-XXXX-XXXXXXX-51

Output

[
    {
        "date": "13 Jul 2022",
        "transaction description": "SOME WORKPLACESalary",
        "debit/cheque": "",
        "credit/deposit": "$3,509.30",
        "balance": "OD $1,725.53"
    },
    {
        "date": "12 Jul 2022",
        "transaction description": "ATM DEPOSITCARD 1605",
        "debit/cheque": "",
        "credit/deposit": "$400.00",
        "balance": "OD $5,234.83"
    },
    {
        "date": "11 Jul 2022",
        "transaction description": "Another TransactionAnother Transaction",
        "debit/cheque": "",
        "credit/deposit": "$104.00",
        "balance": "OD $5,634.83"
    },
    {
        "date": "11 Jul 2022",
        "transaction description": "MB TRANSFERTO XX-XXXX-XXXXXXX-51",
        "debit/cheque": "$4.50",
        "credit/deposit": "",
        "balance": "OD $5,738.83"
    }
]
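
In the output above, the two lines of each description cell are joined with no separator ("SOME WORKPLACESalary"). The sketch below shows how that result arises when a cell's text fragments are concatenated, and how a separator would change it. It assumes the line break in the cell is a `<br>` tag, which the README does not confirm. `CellText` and `cell_text` are hypothetical names, and the standard library's html.parser is used here in place of BeautifulSoup.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the non-empty text fragments inside one cell."""

    def __init__(self):
        super().__init__()
        self.fragments = []

    def handle_data(self, data):
        data = data.strip()
        if data:
            self.fragments.append(data)


def cell_text(cell_html, separator=""):
    # Join the fragments found on either side of tags such as <br>.
    parser = CellText()
    parser.feed(cell_html)
    parser.close()
    return separator.join(parser.fragments)


if __name__ == "__main__":
    cell = "<td>SOME WORKPLACE<br>Salary</td>"  # assumed markup
    print(cell_text(cell))       # SOME WORKPLACESalary
    print(cell_text(cell, " "))  # SOME WORKPLACE Salary
```

With BeautifulSoup, the analogous choice is the `separator` argument of `get_text()`.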

TODO

  • Support for nested tables
  • Support for buggy HTML tables (i.e., td instead of th in thead)
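
One possible direction for the second item, offered only as a sketch: choose the keys by where a row sits (inside thead) rather than by the cell tag, so a header row written with td cells still yields keyed objects. The function name `records_from_sections` and its inputs (rows already extracted as lists of cell text, split by section) are assumptions, not part of the project.

```python
def records_from_sections(thead_rows, tbody_rows):
    """Build the output from rows already split into thead and tbody.

    Each row is a list of cell strings. The first thead row supplies the
    keys whether its cells were written as <th> or <td>.
    """
    if not thead_rows:
        # No header section: keep the list-of-lists shape.
        return tbody_rows
    keys = [cell.strip().lower() for cell in thead_rows[0]]
    return [dict(zip(keys, row)) for row in tbody_rows]


if __name__ == "__main__":
    print(records_from_sections([["ID", "Vendor"]], [["1", "Intel"]]))
    # [{'id': '1', 'vendor': 'Intel'}]
```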