This project provides a command-line tool that converts HTML tables to CSV files. The script reads an HTML file containing tables, extracts the relevant data, and saves it to a CSV file. It is specifically tailored to extract pertinent information from material list reports generated by Eagle Metal software.
- Parses HTML files and extracts table data.
- Aggregates and cleans the data.
- Converts lineal footage measurements from feet and inches to inches for aggregation, then back to feet and inches.
- Outputs the cleaned and aggregated data to a CSV file.
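The lineal footage round trip described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names `to_inches` and `to_feet_inches` are hypothetical, and the sketch assumes measurements are written as whole feet and whole inches in the form `10' 6"`.

```python
import re


def to_inches(measurement: str) -> int:
    """Convert a string such as 10' 6" to a total number of inches."""
    match = re.fullmatch(r"\s*(\d+)'\s*(\d+)\"\s*", measurement)
    if match is None:
        raise ValueError(f"unrecognized measurement: {measurement!r}")
    feet, inches = (int(group) for group in match.groups())
    return feet * 12 + inches


def to_feet_inches(total_inches: int) -> str:
    """Convert a total number of inches back to the feet-and-inches form."""
    feet, inches = divmod(total_inches, 12)
    return f"{feet}' {inches}\""


# Aggregate two lengths in inches, then convert the sum back.
lengths = ["10' 6\"", "5' 8\""]
total = sum(to_inches(length) for length in lengths)
print(to_feet_inches(total))  # 16' 2"
```

Summing in inches avoids carrying between the feet and inches parts during aggregation; the carry happens once, in `divmod`, when the total is converted back.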
- Python 3.x
- beautifulsoup4 library
- pandas library
To use the pre-built executable, follow these steps:
- Download the html_to_csv.exe file from the dist directory.
- Place the executable in the desired directory on your Windows machine.
If you prefer to build the executable from the source code, follow these steps:
- Clone the repository:
    git clone https://github.com/dnakitare/html-to-csv.git
    cd html-to-csv
- Install the required libraries:
    pip install -r requirements.txt
- Create the executable using PyInstaller:
    pyinstaller --onefile html_to_csv.py
  The executable will be generated in the dist directory.
To run the pre-built executable:
- Open Command Prompt.
- Navigate to the directory containing the executable and your HTML file.
- Run the executable with the HTML file as an argument:
    html_to_csv.exe input.html
- The script will generate a CSV file with the same name as the input file, but with a .csv extension.
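The output naming rule can be expressed with `pathlib`. This is a sketch of the rule as stated, not the script's confirmed implementation; the helper name `csv_path_for` is hypothetical.

```python
from pathlib import Path


def csv_path_for(html_path: str) -> Path:
    """Return the input path with its extension replaced by .csv."""
    return Path(html_path).with_suffix(".csv")


print(csv_path_for("input.html"))  # input.csv
```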
To run the Python script directly:
- Open Command Prompt.
- Navigate to the directory containing the Python script and your HTML file.
- Run the script with the HTML file as an argument:
    python html_to_csv.py input.html
- The script will generate a CSV file with the same name as the input file, but with a .csv extension.
Given an HTML file sample.html:
<!DOCTYPE html>
<html>
<head>
<title>Sample HTML</title>
</head>
<body>
<table>
<tr>
<th>SKU</th>
<th>Name</th>
<th>Quantity</th>
</tr>
<tr>
<td>12345</td>
<td>Wood Plank</td>
<td>10</td>
</tr>
<tr>
<td>67890</td>
<td>Plywood</td>
<td>5</td>
</tr>
</table>
</body>
</html>
Running the command:
html_to_csv.exe sample.html
Will generate a CSV file sample.csv with the following content:
Lumber Size,Quantity,Board Footage,Lineal Footage
Wood Plank,10,,10' 0"
Plywood,5,,5' 0"
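For readers curious about the extraction step alone, the rows of the sample table can be read as sketched below. The project itself lists beautifulsoup4 and pandas as dependencies; this sketch deliberately substitutes the standard library's `html.parser` so that it runs without installing anything. The class name `TableRows` is hypothetical, and the sketch does not reproduce the column mapping or aggregation that produces the CSV shown above.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


SAMPLE = """
<table>
<tr><th>SKU</th><th>Name</th><th>Quantity</th></tr>
<tr><td>12345</td><td>Wood Plank</td><td>10</td></tr>
<tr><td>67890</td><td>Plywood</td><td>5</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE)
parser.close()
for row in parser.rows:
    print(row)
```

The first row holds the header cells and the remaining rows hold the data, which is the shape a later cleaning and aggregation step would start from.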
Distributed under the Apache-2.0 license. See LICENSE for more information.
Contributions are welcome! Please feel free to submit a Pull Request.
If you encounter any issues, please open an issue on GitHub.
This project uses the beautifulsoup4 and pandas libraries.