Machine-Learning-Classification-Review 🤷‍♂️

How to do a simple end-to-end machine learning classification project using the Telco customer churn dataset.

In machine learning, classification is a supervised method that assigns data points to discrete labels, or classes. Unlike regression, where the target variable is continuous, the target variable in a classification problem is discrete. Each data point used to train a classification model must have a corresponding label so that the model can learn the characteristics and patterns of each class. Classification can be binary (for example, deciding whether a given email is spam or not) or multi-class (for example, classifying a fruit as an orange, a mango, or a banana).

This project is a binary classification problem.
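To make the idea of learning from labelled examples concrete, here is a minimal sketch of a binary classifier in plain Python. It uses a k-nearest-neighbours vote, chosen only for illustration and not necessarily a model used in the notebook. The feature meanings, the toy data points, and the "Yes"/"No" labels are all assumptions made up for this example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to x with its label, nearest first.
    distances = sorted(
        (math.dist(features, x), label)
        for features, label in zip(train_X, train_y)
    )
    top_labels = [label for _, label in distances[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Made-up training data: (months as a customer, monthly charge).
# Every point carries a discrete label, as supervised classification requires.
train_X = [(1, 80.0), (2, 95.0), (3, 70.0), (40, 30.0), (55, 45.0), (60, 25.0)]
train_y = ["Yes", "Yes", "Yes", "No", "No", "No"]

print(knn_predict(train_X, train_y, (2, 85.0)))   # a new, short-tenure customer
print(knn_predict(train_X, train_y, (50, 35.0)))  # a new, long-tenure customer
```

Because there are only two possible labels, this is a binary problem; a multi-class problem would use the same voting logic with more than two distinct labels.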

Dataset 💾

Kaggle Link

Each row represents a customer, and each column contains a customer attribute described in the column metadata.

The data set includes information about:

Customers who left within the last month – the column is called Churn
Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
Demographic info about customers – gender, age range, and if they have partners and dependents
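As a rough sketch of how the Churn column can be inspected, the snippet below reads a few rows with Python's csv module and computes the share of customers who churned. The rows are made up for illustration and are not taken from the real dataset. Apart from Churn, the column names and the "Yes"/"No" encoding are assumptions.

```python
import csv
import io

# Made-up rows for illustration only; not real data from the dataset.
SAMPLE_CSV = """customerID,tenure,MonthlyCharges,Churn
A-001,1,70.5,Yes
A-002,34,56.9,No
A-003,2,53.8,Yes
A-004,45,42.3,No
"""


def churn_rate(csv_file, positive_label="Yes"):
    """Return the fraction of rows whose Churn value equals positive_label."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        return 0.0
    churned = sum(1 for row in rows if row["Churn"] == positive_label)
    return churned / len(rows)


rate = churn_rate(io.StringIO(SAMPLE_CSV))
print(f"Churn rate in the sample: {rate:.0%}")
```

To run the same check on the downloaded file, pass an open file handle instead of the io.StringIO object.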

How to Use The Repository

You need Python 3 installed on your system. Then clone this repo and, working from the repo's root :: repository_name> ... follow these steps:

Clone this repository: git clone https://github.com/Azie88/Machine-Learning-Classification-Review
In your IDE, create a virtual environment and install the required packages for the project:

Windows:

  python -m venv venv; 
  venv\Scripts\activate; 
  python -m pip install -q --upgrade pip; 
  python -m pip install -qr requirements.txt

Linux & macOS:

  python3 -m venv venv; 
  source venv/bin/activate; 
  python -m pip install -q --upgrade pip; 
  python -m pip install -qr requirements.txt

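The two long command lines have the same structure. They chain multiple commands with the ; separator, which runs them one after another (it does not pipe output between them), but you can also execute them manually one at a time. On Windows, ; works as a separator in PowerShell; in Command Prompt, run the commands one at a time. Each command line does the following: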

Create a Python virtual environment that isolates the project's required libraries to avoid conflicts;
Activate the virtual environment so that the Python interpreter and libraries used are those of the isolated environment;
Upgrade pip, the package manager, to an up-to-date version;
Install the required packages listed in the requirements.txt file so that they can be imported into the Python script and notebook without any issues.
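After running the steps above, a quick sanity check can confirm that the virtual environment is active and that packages can be imported. The sketch below uses only the standard library. The helper names in_virtualenv and missing_packages are hypothetical, and the package names passed in are placeholders that you should replace with the import names of the packages in requirements.txt.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Virtual environment active:", in_virtualenv())
print("Interpreter in use:", sys.executable)

# Placeholder names: swap in the import names of the packages you installed.
print("Missing:", missing_packages(["json", "csv"]))
```

If the first line prints False, activate the environment again before installing or importing anything.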

NB: macOS users, please install Xcode if you run into an issue.

Explore the Jupyter notebook for detailed steps and code execution.

Author ✍️

Andrew Obando

Feel free to star ⭐ this repository if you find it helpful!
