In this project, I applied the CRISP-DM process to a dataset containing campus recruitment data from an MBA college. The dataset can be found on Kaggle.

CRISP-DM-for-Recruitment-Data

Table of Contents

  1. Installation
  2. Project Motivation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Installation

No libraries beyond the Anaconda distribution of Python should be needed to run the code. Remember to download the dataset first, here. The code should run with no issues using Python versions 3.*.

Project Motivation

Campus recruitment is an obstacle that almost all engineering students face at some point in their lives. As a final-year Computer Science student, when I came across a dataset titled Campus Recruitment on Kaggle, I was instantly drawn to it, in hopes of not only understanding the general trend in the industry but also of reassuring myself that I was not a lost cause. Although this dataset is from an MBA college, I think it can still be used to extract valuable information about how one's academic choices can impact one's placements.

Questions explored:

  1. Does the board of education affect placements?
  2. Does it really matter how much you score in your school days?
  3. Is one stream inherently better than the other?
  4. Which degree and MBA specialization have the highest salary?
  5. Does gender bias exist in campus recruitment?
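Questions such as 1 and 5 largely come down to comparing placement rates across groups. The following is a minimal sketch of that comparison with pandas, not the notebook's actual code. The table is made up for illustration, and the column names (`gender`, `ssc_b`, `status`), the values, and the helper `placement_rate` are assumptions rather than details confirmed by this README.

```python
import pandas as pd

# Tiny made-up sample; column names and values are illustrative assumptions,
# not rows from the actual Kaggle dataset.
df = pd.DataFrame({
    "gender": ["M", "F", "M", "F", "M", "F"],
    "ssc_b": ["Central", "Others", "Others", "Central", "Central", "Others"],
    "status": ["Placed", "Placed", "Not Placed", "Placed", "Placed", "Not Placed"],
})


def placement_rate(frame, by):
    """Return the share of placed students for each value of column `by`."""
    # Boolean Series: True where the student was placed.
    placed = frame["status"].eq("Placed")
    # The mean of a boolean Series within each group is the placement rate.
    return placed.groupby(frame[by]).mean()


rate_by_board = placement_rate(df, "ssc_b")
rate_by_gender = placement_rate(df, "gender")
print(rate_by_board)
print(rate_by_gender)
```

A difference in rates between groups is only a starting point; with a small sample it may well be noise, so it is worth checking group sizes before drawing conclusions.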

The primary motivation for this project was to apply the knowledge I gained about the CRISP-DM process to a real-world dataset, as part of my Introduction to Data Science Nanodegree (Udacity, 2020).

File Descriptions

The Python notebook contains all the code written to perform exploratory data analysis and run a simple linear SVM model on the dataset. The README.md file serves as an entry point for the project, telling readers what they can expect from the notebook.
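For readers unfamiliar with the modelling step, here is a minimal sketch of fitting a linear SVM with scikit-learn, which ships with Anaconda. It is not the notebook's code: the features, labels, and the choice to standardize inputs first are all assumptions made for illustration, and the data is invented.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Made-up feature matrix: two hypothetical percentage scores per student.
X = np.array([
    [85.0, 80.0], [78.0, 75.0], [90.0, 88.0], [72.0, 70.0],
    [50.0, 45.0], [55.0, 52.0], [48.0, 50.0], [60.0, 40.0],
])
# Made-up labels: 1 = placed, 0 = not placed.
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Standardize the features, then fit a linear support vector classifier.
model = make_pipeline(StandardScaler(), LinearSVC(random_state=0))
model.fit(X, y)

# Accuracy on the training data only; a real evaluation needs a held-out split.
predictions = model.predict(X)
train_accuracy = model.score(X, y)
print(train_accuracy)
```

On the real dataset, categorical columns would need encoding before being passed to the model, and accuracy should be measured on data the model has not seen.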

Results

The key insights from this analysis are recorded in my Medium post here. I would greatly appreciate any feedback, suggestions, and opinions on my approach to this project in the form of post comments. If you liked my work, why not share it with a friend or two? It would help keep me motivated for future work. Thanks!

Licensing, Authors, and Acknowledgements

You can find the licensing for the data and other descriptive information at the Kaggle link available here. Otherwise, feel free to use the code here as you would like!