Data visualization and EDA projects for different datasets on Kaggle

Kaggle_EDA

This repository collects data visualization examples on several datasets I found online. Each notebook showcases a range of visualization techniques and plots. Links to the datasets are provided below for readers who want to try any of them.

Santander Customer Transaction Prediction - Kaggle Competition

In this challenge, Santander invites Kagglers to help them identify which customers will make a specific transaction in the future, irrespective of the amount of money transacted. The data provided for this competition has the same structure as the real data they have available to solve this problem.

The link to the dataset used is: Santander Kaggle Competition

2018 Kaggle ML & DS Survey

This is Kaggle's second annual Machine Learning and Data Science Survey, and its first-ever survey data challenge. A survey data EDA provides an overview of the industry on an aggregate scale, but it also leaves us wanting to know more about the many specific communities included in the survey.

Therefore, this notebook aims to tell a rich story about a subset of the data science and machine learning community.

The link to the dataset used is: 2018 Kaggle ML and DS Survey

2019 Kaggle ML & DS Survey (Bronze Medal awarded)

This is Kaggle's third annual Machine Learning and Data Science Survey, and its second survey data challenge. This notebook takes a closer look at job selection (big company or small startup) based on the survey responses. This work was awarded a Bronze medal in the competition.

The link to the dataset used is: 2019 Kaggle ML and DS Survey

Install dependencies

To run all the notebooks in this repository, run the command below in your terminal to install the required packages:

pip install -r requirements.txt
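
After installing, it can be useful to confirm that nothing is missing before opening the notebooks. The following is a minimal sketch, not part of this repository: it assumes requirements.txt sits in the current directory and lists one package per line, and the helper names `parse_requirement_names` and `find_missing` are made up for illustration. It uses only the Python standard library (3.8+) and does not handle URL or VCS requirement lines.

```python
from importlib import metadata
from pathlib import Path
import re


def parse_requirement_names(lines):
    """Extract bare package names from requirements-style lines (hypothetical helper)."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options such as "-r"
            continue
        # The name is the leading run of characters before any version specifier or extras.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def find_missing(names):
    """Return the names that are not installed in the current environment (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises PackageNotFoundError if the package is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    path = Path("requirements.txt")  # assumed location: the repository root
    if path.exists():
        required = parse_requirement_names(path.read_text(encoding="utf-8").splitlines())
        missing = find_missing(required)
        print("Missing packages:", ", ".join(missing) if missing else "none")
    else:
        print("requirements.txt not found in the current directory")
```

Run it from the repository root after the pip command; any package it reports as missing can be installed individually with pip.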