Kaggle-submissions

This repository contains my machine learning submissions to Kaggle competitions.

Primary language: Jupyter Notebook

Motivation

Data science has appealed to me since the start of my journey as a Computer Science undergraduate. Python's ability to read, process, manipulate, and visualize very large amounts of data is what drew me to it, so I chose Python as my language for visualizing data and for making predictions by fitting machine learning models to training data.
After taking some data science and machine learning courses on Udemy and Coursera, I wanted to put my skills to the test, and a friend pointed me to Kaggle.

Getting Started on Kaggle

Competition 1: Titanic: Machine Learning from Disaster

Getting started on Kaggle can be a bit daunting. However, once you know the basic regression techniques, you should be able to take on the Titanic: Machine Learning from Disaster competition, linked below:
Titanic: Machine Learning from Disaster.

The Dataset

The dataset for the Titanic Competition is divided into 3 files:

  • The gender_submission.csv file.
  • The train.csv file.
  • The test.csv file.

The `gender_submission.csv` file shows what the submitted output file needs to look like. The `train.csv` file contains the training data, whereas the `test.csv` file contains the test data.
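To make the expected output layout concrete, here is a minimal sketch of writing predictions in the same two-column shape as `gender_submission.csv`. It assumes the header names `PassengerId` and `Survived` used by Kaggle's sample file; the helper `write_submission` and the IDs and predictions are hypothetical and only for illustration.

```python
import csv
import io


def write_submission(passenger_ids, predictions, out_file):
    """Write one row per passenger in the gender_submission.csv layout."""
    if len(passenger_ids) != len(predictions):
        raise ValueError("need exactly one prediction per passenger")
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])  # header row
    for pid, survived in zip(passenger_ids, predictions):
        writer.writerow([pid, int(survived)])  # 1 = survived, 0 = did not


# Illustrative IDs and predictions, not real model output.
buffer = io.StringIO()
write_submission([892, 893, 894], [0, 1, 0], buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

Writing to a file object keeps the helper easy to check in memory; passing `open("submission.csv", "w", newline="")` instead would produce a file on disk.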

Accuracy of the Model and Model used

The model is a logistic regression classifier written in Python and trained on the training data. It placed me in the top 73% of the leaderboard. Still improving!
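As a rough illustration of how logistic regression works, the sketch below fits one with plain batch gradient descent on a tiny made-up dataset. This is not the notebook's code: the two features (`is_female`, `is_first_class`), the toy rows, and the helper names are all assumptions, and in practice a library implementation such as scikit-learn's `LogisticRegression` would typically be used instead.

```python
import math


def sigmoid(z):
    # Fine for the small values in this toy example.
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y  # predicted probability minus label
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        # Step against the average gradient.
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical rows: [is_female, is_first_class]; labels: 1 = survived.
toy_rows = [[1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0], [1, 0], [0, 0]]
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]

weights, bias = fit_logistic(toy_rows, toy_labels)
predictions = [predict(weights, bias, x) for x in toy_rows]
print(predictions)
```

The real `train.csv` has more columns (for example `Pclass`, `Sex`, `Age`, `Fare`), and which of them the submitted model uses is not stated here.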

Competition 2: House Prices: Advanced Regression Techniques

This is my second competition on Kaggle, and again an introductory problem. I solved it using linear regression techniques. The link for this problem can be found below: House Prices: Advanced Regression Techniques

The Dataset

The dataset for the House Prices competition is divided into 3 files:

  • The data_description.txt file, which describes all the features in the training and test data.
  • The train.csv file.
  • The test.csv file.

A sample_submission file is also provided. My work is in the house_regression_analysis.ipynb notebook, and the final submitted file is submission.csv.

Accuracy and model used

The model is a linear regression written in Python and trained on the training data. It placed me in the top 82% of the leaderboard. Still improving!
My submission was ranked as follows: my submission
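For readers new to linear regression, the following is a minimal one-feature ordinary least squares sketch. It is an illustration under stated assumptions rather than the notebook's actual model: `GrLivArea` and `SalePrice` are column names from the competition data, but the numbers are invented, the helper names are hypothetical, and the features used in the real submission are not listed here.

```python
def fit_simple_ols(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(slope, intercept, living_area):
    return slope * living_area + intercept


# Invented values standing in for GrLivArea (sq ft) and SalePrice (USD).
gr_liv_area = [1000, 1500, 2000, 2500]
sale_price = [150000, 200000, 250000, 300000]

slope, intercept = fit_simple_ols(gr_liv_area, sale_price)
estimate = predict_price(slope, intercept, 1800)
print(slope, intercept, estimate)
```

With several features, the same idea generalizes to multiple linear regression, which is usually handled by a library such as scikit-learn's `LinearRegression`.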