Pinned Repositories
Big-Mart--Sales--Practice-Problem-
The data scientists at BigMart have collected 2013 sales data for 1559 products across 10 stores in different cities. Also, certain attributes of each product and store have been defined. The aim is to build a predictive model and find out the sales of each product at a particular store. Using this model, BigMart will try to understand the properties of products and stores which play a key role in increasing sales.
Big-Mart-Sales-Practice-Problem
The data scientists at BigMart have collected 2013 sales data for 1559 products across 10 stores in different cities. Also, certain attributes of each product and store have been defined. The aim is to build a predictive model and find out the sales of each product at a particular store. Using this model, BigMart will try to understand the properties of products and stores which play a key role in increasing sales.
Big-Mart-Sales-Practice-Problem-
The data scientists at BigMart have collected 2013 sales data for 1559 products across 10 stores in different cities. Also, certain attributes of each product and store have been defined. The aim is to build a predictive model and find out the sales of each product at a particular store. Using this model, BigMart will try to understand the properties of products and stores which play a key role in increasing sales.
Insurance-Case-Study-Regression-Tree-using-CART
In the insurance industry, a company often wants to assess what kind of claims, in terms of monetary value, its consumers make. This helps the company evaluate the premiums being offered against the claims being made. With these details, the company can calculate the losses it may incur from each consumer. This case study helps a motor insurance company build a statistical model for assessing its consumer base.
Machine-Learning--Housing-Problem
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.
Titenic
1. Introduction
1.1 Load and Check Data
2. Data Mining
2.1 Missing Value Treatment
2.2 Outlier Treatment
3. Feature Engineering
3.1 Passenger Name
3.2 Identify Family Size
3.3 Introduce New Variable
3.4 Create Dummy Variable
4. Model
4.1 Split into Train, Validation and Test Sets
4.2 Building the Model
4.3 Predict on Validation Dataset
4.4 Confusion Matrix
4.5 ROC
4.6 Prediction
5. Conclusion

1. Introduction
This is my first step at a Kaggle script. I have chosen to work with the Titanic dataset. I will do data mining and feature engineering to find a few independent variables, then use logistic regression to create a model predicting survival on the Titanic. I am new to machine learning and hoping to learn a lot, so feedback is very welcome.

1.1 Load and Check Data
# Packages
library('ggplot2')   # visualization
library('ggthemes')  # visualization
library('scales')    # visualization
library(dplyr)       # data manipulation
library(glmnet)      # model creation
library(ROCR)        # ROC curve

Let’s read in the data and take a peek at it.
> train <- read.csv("..............\\train.csv", stringsAsFactors = FALSE)
> test <- read.csv(".................\\test.csv", stringsAsFactors = FALSE)

# Check data
> str(train)
'data.frame': 891 obs. of 12 variables:
 $ PassengerId: int 1 2 3 4 5 6 7 8 9 10 ...
 $ Survived   : int 0 1 1 1 0 0 0 0 1 1 ...
 $ Pclass     : int 3 1 3 1 3 3 1 3 3 2 ...
 $ Name       : chr "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
 $ Sex        : chr "male" "female" "female" "female" ...
 $ Age        : num 22 38 26 35 35 NA 54 2 27 14 ...
 $ SibSp      : int 1 1 0 1 0 0 0 3 0 1 ...
 $ Parch      : int 0 0 0 0 0 0 0 1 2 0 ...
 $ Ticket     : chr "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
 $ Fare       : num 7.25 71.28 7.92 53.1 8.05 ...
 $ Cabin      : chr "" "C85" "" "C123" ...
 $ Embarked   : chr "S" "C" "S" "S" ...
> str(test)
'data.frame': 418 obs. of 11 variables:
 $ PassengerId: int 892 893 894 895 896 897 898 899 900 901 ...
 $ Pclass     : int 3 3 2 3 3 3 3 2 3 3 ...
 $ Name       : chr "Kelly, Mr. James" "Wilkes, Mrs. James (Ellen Needs)" "Myles, Mr. Thomas Francis" "Wirz, Mr. Albert" ...
 $ Sex        : chr "male" "female" "male" "male" ...
 $ Age        : num 34.5 47 62 27 22 14 30 26 18 21 ...
 $ SibSp      : int 0 1 0 0 1 0 0 1 0 2 ...
 $ Parch      : int 0 0 0 0 1 0 0 1 0 0 ...
 $ Ticket     : chr "330911" "363272" "240276" "315154" ...
 $ Fare       : num 7.83 7 9.69 8.66 12.29 ...
 $ Cabin      : chr "" "" "" "" ...
 $ Embarked   : chr "Q" "S" "Q" "S" ...

Here we can see the eleven independent variables.

Data Dictionary:
Variable Name   Description
Survived        Survived (1) or died (0)
Pclass          Passenger’s class
Name            Passenger’s name
Sex             Passenger’s sex
Age             Passenger’s age
SibSp           Number of siblings/spouses aboard
Parch           Number of parents/children aboard
Ticket          Ticket number
Fare            Fare
Cabin           Cabin
Embarked        Port of embarkation
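The outline lists a missing value treatment step (2.1) and a family size step (3.2), but the description stops before showing either. As a rough sketch of what those two steps might look like, the snippet below uses only the first ten training rows printed by str(train). The median imputation of Age and the definition FamilySize = SibSp + Parch + 1 are assumptions made for illustration, not necessarily the method used in the repository, and the names passengers, age_median and FamilySize are hypothetical.

```r
# Illustrative data: the first ten rows of Survived, Age, SibSp and Parch
# as shown in the str(train) output (not the full 891-row dataset).
passengers <- data.frame(
  Survived = c(0, 1, 1, 1, 0, 0, 0, 0, 1, 1),
  Age      = c(22, 38, 26, 35, 35, NA, 54, 2, 27, 14),
  SibSp    = c(1, 1, 0, 1, 0, 0, 0, 3, 0, 1),
  Parch    = c(0, 0, 0, 0, 0, 0, 0, 1, 2, 0)
)

# Missing value treatment (assumed method): fill missing ages
# with the median of the observed ages.
age_median <- median(passengers$Age, na.rm = TRUE)
passengers$Age[is.na(passengers$Age)] <- age_median

# Family size (assumed definition): siblings/spouses plus
# parents/children plus the passenger themselves.
passengers$FamilySize <- passengers$SibSp + passengers$Parch + 1

print(passengers)
```

On the full dataset the same two statements would apply to train directly; whether median imputation is appropriate there depends on how Age is distributed, which this ten-row sample cannot show.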
hinalba15's Repositories
hinalba15/Insurance-Case-Study-Regression-Tree-using-CART
hinalba15/Big-Mart--Sales--Practice-Problem-
hinalba15/Big-Mart-Sales-Practice-Problem
hinalba15/Big-Mart-Sales-Practice-Problem-
hinalba15/Machine-Learning--Housing-Problem
hinalba15/Titenic