Data Preprocessing CLI

A command-line interface (CLI) tool for common data preprocessing tasks: handling missing data, encoding categorical variables, and scaling features. It can also generate plots to visualize the data.

Usage

Missing Data Handling


python cli.py missing --input-file data/input_data.csv --output-file data/output_data.csv
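
This README does not state which strategy the missing command applies. As a rough illustration of the idea, the sketch below assumes mean imputation on a single numeric column; the function name impute_mean and the sample values are hypothetical and are not taken from cli.py.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values.

    Assumption: mean imputation. The actual `missing` command may use
    a different strategy.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        # No observed values, so there is no mean to fill with.
        return list(values)
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with one missing entry.
print(impute_mean([1.0, None, 3.0]))  # prints [1.0, 2.0, 3.0]
```

Mean imputation keeps the column mean unchanged but shrinks its variance, so a median or a constant fill may suit skewed data better.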

Categorical Variable Encoding


python cli.py encode --input-file data/input_data.csv --output-file data/encoded_data.csv --strategy onehot --columns Country,Purchased
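
To show what the onehot strategy means in general, here is a minimal standard-library sketch. It is not the tool's implementation: the function name one_hot, the alphabetical category order, and the sample values are assumptions made for illustration.

```python
def one_hot(values):
    """Return (categories, rows), with one 0/1 indicator per category.

    Assumption: categories are ordered alphabetically. The real
    `encode` command may order or name the output columns differently.
    """
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical values for a "Country" column.
categories, rows = one_hot(["France", "Spain", "France"])
print(categories)  # prints ['France', 'Spain']
print(rows)        # prints [[1, 0], [0, 1], [1, 0]]
```

Each input column becomes one output column per distinct category, so the encoded file is usually wider than the input file.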

Feature Scaling

Standard Scaling


python cli.py scale --input-file data/encoded_data.csv --output-file data/scaled_data.csv --scaler standard --columns 1,2
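
Standard scaling usually means transforming each selected column to zero mean and unit variance, z = (x - mean) / std. The sketch below assumes that definition with the population standard deviation; the function name standard_scale is hypothetical, and the README does not confirm how the standard scaler is implemented in cli.py.

```python
from statistics import mean, pstdev


def standard_scale(values):
    """Scale values to zero mean and unit (population) variance.

    Assumption: population standard deviation (divide by n).
    A constant column is mapped to all zeros to avoid dividing by zero.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical numeric column.
scaled = standard_scale([1.0, 2.0, 3.0])
print(scaled)
```

After scaling, the middle value sits at 0 and the other two are symmetric around it.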

Min-Max Scaling


python cli.py scale --input-file data/input_data.csv --output-file data/scaled_data.csv --scaler minmax --columns 1,2
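
Min-max scaling conventionally maps each selected column onto the range [0, 1] using (x - min) / (max - min). The following sketch assumes that target range; the function name min_max_scale is hypothetical and the behavior of the minmax scaler in cli.py is not confirmed by this README.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1.

    Assumption: target range [0, 1]. A constant column is mapped to
    all zeros to avoid dividing by zero.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical numeric column.
print(min_max_scale([10.0, 20.0, 30.0]))  # prints [0.0, 0.5, 1.0]
```

Because the result depends only on the minimum and maximum, a single outlier can compress all other values into a narrow band; standard scaling is less sensitive to that.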

Data Visualization


python cli.py generate_plots --input-file input_data.csv --output-file output_plot.png --strategy heatmap --columns column1 column2

How to Use

  1. Clone this repository to your local machine.
  2. Navigate to the project directory in your terminal.
  3. Run any of the commands shown in the Usage section above.

Dependencies

Acknowledgments

Special thanks to the contributors and the open-source community for their work.