The purpose of this analysis is to create a binary classification model, using deep learning techniques, that predicts whether an organization funded by Alphabet Soup will be successful in its venture. The model uses a dataset of over 34,000 organizations that have received funding from Alphabet Soup, containing metadata about each organization.
- Target variable(s) for the model: The target variable for the model is IS_SUCCESSFUL.
- Feature variable(s) for the model: The feature variables for the model include APPLICATION_TYPE, AFFILIATION, CLASSIFICATION, USE_CASE, ORGANIZATION, STATUS, INCOME_AMT, SPECIAL_CONSIDERATIONS, and ASK_AMT.
- Variable(s) removed from the input data: The EIN and NAME columns were removed from the input data as they are identification columns and not useful as features or targets.
- Feature variable NAME has been brought back in the last model.
- Neurons, layers, and activation functions selected for the neural network model and rationale: The model consists of three hidden layers with 80, 30, and 1 neurons, respectively, and ReLU activation functions. The output layer uses a sigmoid activation function for binary classification. The structure was chosen to provide a balance between complexity and the potential for overfitting, while maintaining the ability to learn complex patterns in the data.
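
To make the roles of the two activation functions concrete, here is a minimal, dependency-free sketch of a forward pass through a network shaped like the one described above (ReLU layers of 80, 30, and 1 neurons, then a single sigmoid output unit). It is not the notebook's code: the weights are random and untrained, and the input width `N_INPUTS` and all helper names are hypothetical, chosen only for illustration.

```python
import math
import random


def relu(x):
    """ReLU: passes positive values through, clips negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Sigmoid: squashes any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_inputs, relu_layer_sizes, rng):
    """Stack the ReLU layers and append one sigmoid output unit."""
    sizes = [n_inputs] + list(relu_layer_sizes) + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict(network, features):
    """Apply ReLU in every layer except the last, which uses sigmoid."""
    activations = features
    for weights, biases in network[:-1]:
        activations = dense(activations, weights, biases, relu)
    out_weights, out_biases = network[-1]
    return dense(activations, out_weights, out_biases, sigmoid)[0]


rng = random.Random(0)
N_INPUTS = 40  # hypothetical number of encoded input features
network = build_network(N_INPUTS, (80, 30, 1), rng)
sample = [rng.random() for _ in range(N_INPUTS)]  # stand-in for one scaled row
probability = predict(network, sample)
label = int(probability >= 0.5)  # threshold into a 0/1 IS_SUCCESSFUL prediction
```

Because the last unit is a sigmoid, `probability` always lies strictly between 0 and 1, which is what lets it be read as the predicted chance of success and thresholded into a binary label.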
In this project, we trained four models. The first three removed the EIN and NAME columns and varied the neurons, layers, and binning, as described below:
- Attempt 1: Keep the same structure, but change the number of neurons in each layer and increase the number of epochs from 50 to 100.
  - Accuracy increased very slightly to 72.7%.
- Attempt 2: Optimise the structure using Keras Tuner.
  - Allow the activation function to be chosen from relu, sigmoid, and tanh.
  - sigmoid is still the only option for the final layer.
  - Allow the number of neurons to vary from 6 to ~75.
  - Accuracy increased to 73.1%.
- Attempt 3: Use few neurons (fewer than the number of features) with a sigmoid activation function for the non-input layers.
  - Accuracy decreased to 72.6%.
- Attempt 4: The final attempt, which can be found in Final-AlphabetSoupCharity_Optimisation.ipynb, has the best accuracy at 78.5%.
The feature variables for this final model are APPLICATION_TYPE, AFFILIATION, CLASSIFICATION, USE_CASE, ORGANIZATION, STATUS, INCOME_AMT, SPECIAL_CONSIDERATIONS, NAME, and ASK_AMT, and the target variable is IS_SUCCESSFUL. The model consists of three hidden layers with 14, 7, and 1 neurons, respectively. The hidden layers use ReLU activation functions, and the output layer uses a sigmoid activation function for binary classification.
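
To give a rough sense of how much smaller the final architecture is than the first one, the sketch below counts the trainable parameters of a stack of dense layers, using the standard rule that each layer has (inputs + 1 bias) × neurons parameters. The input width `N_INPUTS` is a hypothetical placeholder: the real width depends on how the categorical columns are encoded, and it differs between the models because the final one also includes NAME. The same value is used for both only to keep the comparison simple.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Trainable parameters of stacked dense layers.

    Each layer contributes (fan_in + 1) * n_neurons parameters:
    one weight per incoming value plus one bias per neuron.
    """
    total = 0
    fan_in = n_inputs
    for n_neurons in layer_sizes:
        total += (fan_in + 1) * n_neurons
        fan_in = n_neurons
    return total


N_INPUTS = 100  # hypothetical encoded feature count, not taken from the notebook

# Layer sizes as described in this report, followed by the single sigmoid output unit.
first_model = dense_param_count(N_INPUTS, [80, 30, 1, 1])
final_model = dense_param_count(N_INPUTS, [14, 7, 1, 1])

print(first_model)  # 10543
print(final_model)  # 1529
```

Under this assumption the final model has far fewer parameters than the first, so its higher accuracy is unlikely to come from added capacity alone.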
- Achievement of the target model performance: The model did achieve the target performance, with an accuracy of 78.5%. However, multiple attempts were needed to optimize the model, including adjusting the input data, modifying the structure of the neural network, and modifying the training regimen.
- Steps taken in attempts to increase model performance: To increase model performance, the following steps were taken:
  - Dropping additional irrelevant columns from the input data.
  - Creating more bins for rare occurrences in columns and adjusting the number of values for each bin.
  - Adding more neurons to a hidden layer.
  - Adding or removing hidden layers.
  - Using different activation functions for the hidden layers.
  - Increasing or decreasing the number of epochs in the training regimen.
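
The binning step in the list above can be sketched as follows. This is a minimal, dependency-free illustration, not the notebook's code: the helper `bin_rare_values`, the `min_count` cutoff, the "Other" label, and the sample values are all hypothetical.

```python
from collections import Counter


def bin_rare_values(values, min_count, other_label="Other"):
    """Replace every value that occurs fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Hypothetical sample of a categorical column such as APPLICATION_TYPE.
application_types = ["T3", "T3", "T3", "T4", "T4", "T9", "T12"]

binned = bin_rare_values(application_types, min_count=2)
print(binned)  # ['T3', 'T3', 'T3', 'T4', 'T4', 'Other', 'Other']
```

Raising `min_count` merges more categories into the shared bin, which reduces the number of columns produced when the feature is later one-hot encoded. Lowering it keeps more distinct categories at the cost of a wider input.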
The final deep learning model reached the desired performance, predicting the success of Alphabet Soup-funded organizations with 78.5% accuracy. Several attempts were made to optimize the model through data preprocessing and neural network structure adjustments. The first three attempts stayed near 73%, and the improvement came in the final attempt, which reintroduced the NAME feature.