Neural_Network_Charity_Analysis

Creation of a binary classifier used to predict the success rate of applicants when funded by a specific company.

Overview of the Analysis

Summary Analysis

The purpose of this analysis is to create a neural network binary classifier capable of predicting whether applicants will be successful if funded by Alphabet Soup. To achieve this, a CSV file containing more than 34,000 organizations that have received funding from Alphabet Soup over the years is preprocessed and used to train and evaluate the model, which can then inform decisions about future applicants.

Results

Data Preprocessing

The target variable for the model is "IS_SUCCESSFUL".

The features of the model include:

  • AFFILIATION
  • CLASSIFICATION
  • USE_CASE
  • ORGANIZATION
  • INCOME_AMT
  • SPECIAL_CONSIDERATIONS
  • STATUS
  • ASK_AMT
  • APPLICATION_TYPE

Unnecessary variables such as "EIN" and "NAME" were removed from the data, as they are neither targets nor features.
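
Below is a minimal sketch of this preprocessing step, assuming the usual Pandas/scikit-learn workflow; the file name charity_data.csv, the random seed, and the use of one-hot encoding are assumptions, not confirmed by this README.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the charity data (file name is an assumption).
application_df = pd.read_csv("charity_data.csv")

# Drop identification columns that are neither targets nor features.
application_df = application_df.drop(columns=["EIN", "NAME"])

# One-hot encode the categorical columns so the network can consume them.
application_dummies = pd.get_dummies(application_df)

# Separate the target from the features.
y = application_dummies["IS_SUCCESSFUL"].values
X = application_dummies.drop(columns=["IS_SUCCESSFUL"]).values

# Split into training and testing sets, then scale the features.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=78)
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```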

Compiling, Training, and Evaluating the Model

  • To capture nonlinear structure in the data and improve performance, the hidden layers of the neural network use the ReLU activation function. The first hidden layer has 8 neurons and the second has 6, while the output layer uses a sigmoid activation function, as is standard for binary classification; this architecture is sketched below.
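
A minimal sketch of this architecture in TensorFlow/Keras, using the preprocessed arrays from above; the optimizer, loss, and epoch count are assumptions, as the README does not state them.

```python
import tensorflow as tf

number_input_features = X_train_scaled.shape[1]

nn = tf.keras.models.Sequential([
    # First hidden layer: 8 neurons with ReLU activation.
    tf.keras.layers.Dense(units=8, activation="relu",
                          input_dim=number_input_features),
    # Second hidden layer: 6 neurons with ReLU activation.
    tf.keras.layers.Dense(units=6, activation="relu"),
    # Output layer: a single sigmoid unit for binary classification.
    tf.keras.layers.Dense(units=1, activation="sigmoid"),
])

nn.compile(loss="binary_crossentropy", optimizer="adam",
           metrics=["accuracy"])
fit_model = nn.fit(X_train_scaled, y_train, epochs=100)

# Evaluate against the held-out test set.
model_loss, model_accuracy = nn.evaluate(X_test_scaled, y_test, verbose=2)
```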


For the optimization of the model, the "STATUS" and "SPECIAL_CONSIDERATIONS" columns were removed in an attempt to increase performance. The first hidden layer was widened from 8 to 14 neurons and the second from 6 to 8. The count threshold used to bin rare "CLASSIFICATION" values was also decreased, and a callback was created to save the model's weights every five epochs; these changes are sketched below.
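
A sketch of the optimized setup follows. The batch size and the hdf5 checkpoint path are assumptions for illustration; note that Keras counts ModelCheckpoint's save_freq in batches, so the five-epoch interval is converted accordingly.

```python
import os
import tensorflow as tf

# Optimization step 1 (re-run before one-hot encoding): drop two more columns.
# application_df = application_df.drop(
#     columns=["STATUS", "SPECIAL_CONSIDERATIONS"])

os.makedirs("checkpoints", exist_ok=True)

# save_freq is measured in batches, so convert 5 epochs into batch counts.
batch_size = 32  # assumption; the README does not state a batch size
batches_per_epoch = -(-len(X_train_scaled) // batch_size)  # ceiling division
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/weights.{epoch:02d}.hdf5",
    save_weights_only=True,
    save_freq=5 * batches_per_epoch)

nn_opt = tf.keras.models.Sequential([
    # First hidden layer widened from 8 to 14 neurons.
    tf.keras.layers.Dense(units=14, activation="relu",
                          input_dim=number_input_features),
    # Second hidden layer widened from 6 to 8 neurons.
    tf.keras.layers.Dense(units=8, activation="relu"),
    tf.keras.layers.Dense(units=1, activation="sigmoid"),
])
nn_opt.compile(loss="binary_crossentropy", optimizer="adam",
               metrics=["accuracy"])
fit_model = nn_opt.fit(X_train_scaled, y_train, epochs=100,
                       batch_size=batch_size, callbacks=[cp_callback])
```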


Summary and Recommendation

Despite the measures taken, the target model performance of 75% accuracy was not achieved, and there was little to no difference in accuracy between the pre-optimized and post-optimized models.

This could be explained by a number of factors such as:

  • A suboptimal activation function may have been used
  • The number of neurons in the first and second hidden layers may have needed further adjustment
  • The deletion of columns may have significantly altered the result (if important columns were removed)
  • Different activation functions would require further testing to determine which is most suitable for increasing the model's accuracy

However, the training loss appears to have decreased from 1.02 to 0.75.

As a recommendation, the number of neurons and the activation functions should be chosen carefully with regard to the data in order to achieve higher accuracy; an automated hyperparameter search, sketched below, is one way to do this.
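
Keras Tuner is not used in the original project, so the following is only an illustrative sketch of such a search; the search ranges and tuner settings are assumptions.

```python
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    """Build a candidate model with a tunable activation and layer widths."""
    activation = hp.Choice("activation", ["relu", "tanh"])
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Dense(
        units=hp.Int("first_units", min_value=6, max_value=30, step=2),
        activation=activation, input_dim=number_input_features))
    model.add(tf.keras.layers.Dense(
        units=hp.Int("second_units", min_value=4, max_value=20, step=2),
        activation=activation))
    model.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model

# Search over activations and layer widths, scoring on validation accuracy.
tuner = kt.Hyperband(build_model, objective="val_accuracy",
                     max_epochs=20, project_name="charity_tuning")
tuner.search(X_train_scaled, y_train,
             validation_data=(X_test_scaled, y_test))
best_model = tuner.get_best_models(num_models=1)[0]
```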