This command-line interface (CLI) tool performs common data preprocessing tasks: handling missing data, encoding categorical variables, and scaling features. It can also generate plots to visualize the data.
python cli.py missing --input-file data/input_data.csv --output-file data/output_data.csv
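The `missing` command above takes no strategy option, so how the tool treats missing values is not stated here. As a rough illustration only, the sketch below shows two common pandas approaches: dropping incomplete rows and filling gaps with the column mean. The sample data and column names are hypothetical, and neither approach is confirmed to be what `cli.py` does.

```python
import pandas as pd

# Hypothetical sample data with one missing value per column.
nan = float("nan")
df = pd.DataFrame({
    "Age": [44.0, nan, 30.0],
    "Salary": [72000.0, 48000.0, nan],
})

# Option 1 (assumption): drop every row that contains a missing value.
dropped = df.dropna()

# Option 2 (assumption): replace missing values with the column mean.
filled = df.fillna(df.mean(numeric_only=True))

print(dropped)
print(filled)
```

Dropping rows is simple but discards data, while mean filling keeps every row at the cost of shrinking the column's variance.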
python cli.py encode --input-file data/input_data.csv --output-file data/encoded_data.csv --strategy onehot --columns Country,Purchased
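To make the `onehot` strategy concrete, here is a minimal sketch of one-hot encoding with `pandas.get_dummies`. The sample rows are hypothetical, and whether `cli.py` uses this exact call internally is an assumption; the column names simply mirror the `encode` command above.

```python
import pandas as pd

# Hypothetical sample data; column names mirror the encode example.
df = pd.DataFrame({
    "Country": ["France", "Spain", "France"],
    "Age": [44, 27, 30],
    "Purchased": ["No", "Yes", "No"],
})

# One-hot encode only the listed columns; other columns pass through unchanged.
# Each category becomes its own 0/1 indicator column named <column>_<category>.
encoded = pd.get_dummies(df, columns=["Country", "Purchased"], dtype=int)

print(encoded.columns.tolist())
```

The untouched `Age` column stays in place, and each encoded column is replaced by one indicator column per distinct value.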
python cli.py scale --input-file data/encoded_data.csv --output-file data/scaled_data.csv --scaler standard --columns 1,2
python cli.py scale --input-file data/input_data.csv --output-file data/scaled_data.csv --scaler minmax --columns 1,2
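The two `--scaler` values in the `scale` commands above correspond to two standard formulas, sketched here in plain pandas. This is an illustration under assumptions, not the tool's confirmed implementation: the sample data is hypothetical, columns are selected by name rather than by the numeric indices the CLI accepts, and the use of the population standard deviation (`ddof=0`) is a choice made for this sketch.

```python
import pandas as pd

# Hypothetical numeric data.
df = pd.DataFrame({
    "Age": [20.0, 30.0, 40.0],
    "Salary": [1000.0, 2000.0, 4000.0],
})
cols = ["Age", "Salary"]

# Standard scaling: subtract the column mean and divide by the standard
# deviation, so each column ends up with mean 0 and unit variance.
# ddof=0 (population standard deviation) is an assumption of this sketch.
standard = (df[cols] - df[cols].mean()) / df[cols].std(ddof=0)

# Min-max scaling: map each column linearly onto the range [0, 1].
minmax = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())

print(standard)
print(minmax)
```

Standard scaling suits features that are roughly bell-shaped, while min-max scaling preserves the shape of the distribution but is sensitive to outliers, since a single extreme value sets the range.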
python cli.py generate_plots --input-file input_data.csv --output-file output_plot.png --strategy heatmap --columns column1 column2
- Clone this repository to your local machine.
- Navigate to the project directory in your terminal.
- Run the desired commands from the examples provided above.
- Python (>=3.6)
- Pandas
- Matplotlib
Special thanks to the contributors and the open-source community for their work on this project.