How to do a simple end-to-end machine learning classification project using the Telco Customer Churn dataset.
In machine learning, classification is a supervised method that assigns data points to labels, or classes. Unlike regression, the target variable in a classification problem is discrete. Each data point used to train a classification model must have a corresponding label so that the model can learn the characteristics and patterns of each class. Classification can be binary (for example, deciding whether a given email is spam or not) or multi-class (for example, classifying a fruit as an orange, a mango, or a banana).
This project is a binary classification problem.
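To make the idea of learning from labelled examples concrete, here is a minimal sketch of a binary classifier in plain Python. It applies a nearest-centroid rule to made-up toy points. The feature values, the helper names fit_centroids and predict, and the choice of nearest-centroid are all illustrative assumptions, not the method used in this project's notebook.

```python
from math import dist


def fit_centroids(points, labels):
    """Average the training points of each class into one centroid per label."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Toy, made-up data: (months as a customer, monthly charge). Not from the real dataset.
train_points = [(2, 80.0), (5, 95.0), (48, 30.0), (60, 45.0)]
train_labels = ["Yes", "Yes", "No", "No"]  # discrete target: churned or not

centroids = fit_centroids(train_points, train_labels)
print(predict(centroids, (3, 90.0)))
print(predict(centroids, (55, 40.0)))
```

Every training point carries a label, which is what makes the method supervised; with only two possible labels, the problem is binary.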
Each row represents a customer, and each column contains one of the customer's attributes, as described in the column metadata.
The data set includes information about:
- Customers who left within the last month – the column is called Churn
- Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
- Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
- Demographic info about customers – gender, age range, and if they have partners and dependents
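As a rough illustration of how the Churn column can be summarised, the sketch below computes a churn rate using only the standard library. The sample rows are made up, the column names tenure and MonthlyCharges are assumptions, and so is the Yes/No encoding of Churn; only the Churn column name comes from the description above. The churn_rate helper is hypothetical.

```python
import csv
import io

# Made-up sample rows for illustration only. "Churn" is the column named above;
# "tenure", "MonthlyCharges" and the Yes/No encoding are assumptions.
SAMPLE_CSV = """tenure,MonthlyCharges,Churn
1,70.5,Yes
34,56.9,No
2,53.8,Yes
45,42.3,No
"""


def churn_rate(csv_text):
    """Return the fraction of rows whose Churn value is 'Yes'."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no rows to summarise")
    churned = sum(1 for row in rows if row["Churn"].strip().lower() == "yes")
    return churned / len(rows)


print(f"Churn rate: {churn_rate(SAMPLE_CSV):.0%}")
```

On the real data, the same check would show how balanced the two classes are, which is worth knowing before training a classifier.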
You need Python 3 on your system. Then clone this repo and, working from the repo's root (repository_name> ...), follow the steps below:
- Clone this repository:
git clone https://github.com/Azie88/Machine-Learning-Classification-Review
- In your IDE, create a virtual environment and install the required packages for the project:
  - Windows:
    python -m venv venv; venv\Scripts\activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt
  - Linux & macOS:
    python3 -m venv venv; source venv/bin/activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt
The two long command lines have the same structure. They chain several commands with the ; separator (this is sequencing, not piping), but you can also run the commands manually, one after the other. In order, they:
- Create a Python virtual environment that isolates the project's required libraries to avoid conflicts;
- Activate the virtual environment so that the Python kernel and libraries are those of the isolated environment;
- Upgrade pip, the package manager, to the latest version so that installs work correctly;
- Install the required libraries/packages listed in the requirements.txt file so that they can be imported into the Python script and notebook without any issue.
NB: macOS users, please install Xcode if you run into an issue.
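If you are unsure whether the activation step worked, a quick check like the sketch below may help. It relies on the standard behaviour that sys.prefix differs from sys.base_prefix inside an environment created by venv. The in_virtualenv helper name is hypothetical.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-created environment."""
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtualenv())
```

Run it with the same python command you used for the installation. If it reports that no virtual environment is active, repeat the activation step before installing the requirements.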
- Explore the Jupyter notebook for detailed steps and code execution.
Andrew Obando
Feel free to star ⭐ this repository if you find it helpful!