topepo/caret

Issue with my logistic regression model with R

ZOLASERGE9 opened this issue · 0 comments

Hi! I run my logistic regression model but I get an error message. Here is the message through this screenshot:

Message error

or

Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :1 NA's :1
Error: Stopping
In addition: There were 26 warnings (use warnings() to see them)

Here are my data sources :

test.csv
train.csv
gender_submission.csv

Look at my codes :

`#The objective : Predict survival on the Titanic and get familiar with ML basics

#Step :
#Data import
#Data Cleaning
#Descrptives statistcs
#Creation of the model
#Estimating model quality

#---------------------------------------------------------------------------------
#------------------ Step 1 : Data import -----------------------------------------
#---------------------------------------------------------------------------------
#Download packages
library(readr) #Import data
library(dplyr) #Data manipulation
library(tidyr) #Data manipulation
library(ggplot2)
library(lattice)
library(caret) #Machine Learning
library(recipes) #Machine Learning

test <- read_csv("titanic/test.csv") #Data for building the model
train <- read_csv("titanic/train.csv") #Data for testing the model

#---------------------------------------------------------------------------------
#------------------ Step 2 : Data manipulation -----------------------------------
#---------------------------------------------------------------------------------

full <- bind_rows(train, test) #Gather into one dataset

head(full) #Visualize the five first data

sum(is.na(full)) # total number of missing data

colMeans(is.na(full)) # Percentage of missing values for each column

full<- full[!is.na(full$Embarked),] #Delete missing values from Embarked variable
full<- full[!is.na(full$Survived),] #Delete missing values from Survived variable

full[is.na(full$Age),]$Age <- median(full$Age, na.rm = T) #Replace missings values with the median of the column "Age"

#Select the data we save for the rest of the analysis

full <- full %>% select("Survived","Pclass", "Sex",
"Age", "SibSp", "Parch", "Fare", "Embarked")

#---------------------------------------------------------------------------------
#------------------ Step 3 : Creation of the model -------------------------------
#---------------------------------------------------------------------------------

set.seed(222)#Set up a random seed

#Redivide the data in train(75%) et test

smp_size <- floor(0.75 * nrow(full))
train_ind <-sample(seq_len(nrow(full)), size = smp_size)

train <- full[train_ind,] #Filter of each row which are in "train_ind"
test <- full[-train_ind,] #Filter of each row which aren't in "train_ind"

#---------------------------------------------------------------------------------
#------------------ Step 4 : Creation of the model -------------------------------
#---------------------------------------------------------------------------------

fitControl <- trainControl(method="cv", number=10, savePredictions = TRUE) #Parameters of the module
Survived <- full$Survived
Survived <- as.factor(Survived)
lr_model <- train(Survived~ .,
data = train,
method = "glm",
family = binomial(),
trainControl = fitControl)

summary(lr_model`

Thank you so much for your answers and supports.