Note: Train the models with the chosen activation function on the hidden layers, running each configuration in its own independent environment or Colab notebook.
Compare how the different chosen activation functions behave and how they affect the performance of a shallow neural network with fixed architectural parameters on each multi-class image classification dataset (MNIST Digits and CIFAR-10), using the selected evaluation metrics. A model-building sketch (assuming TensorFlow/Keras) follows the two architecture lists below.
- MNIST Handwritten Digits
- Input Layer: 28,28,1
- Hidden Layer: 32
- Dropout Layer: 0.1
- Hidden Layer: 64
- Dropout Layer: 0.1
- Output Layer: 10
- CIFAR-10
- Input Layer: 32,32,3
- Hidden Layer:
- Dropout Layer: 0.1
- Hidden Layer:
- Dropout Layer: 0.1
- Output Layer: 10
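A minimal model-building sketch for the two architectures above, assuming TensorFlow/Keras and fully connected (Dense) hidden layers, since the note does not name a framework or layer type. The CIFAR-10 hidden-layer widths are left blank in the note, so they stay a parameter here rather than being filled in.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape, hidden_units=(32, 64), dropout_rate=0.1,
                activation="relu"):
    """Flatten -> Dense -> Dropout -> Dense -> Dropout -> Dense(10, softmax)."""
    model = tf.keras.Sequential([tf.keras.Input(shape=input_shape),
                                 layers.Flatten()])
    for units in hidden_units:
        model.add(layers.Dense(units, activation=activation))
        model.add(layers.Dropout(dropout_rate))
    model.add(layers.Dense(10, activation="softmax"))  # 10-class output layer
    return model

mnist_model = build_model((28, 28, 1))  # hidden widths 32 and 64, as listed above
cifar_model = build_model((32, 32, 3))  # hidden widths not specified in the note
```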
Datasets to be used (a loading and preprocessing sketch follows this list):
- MNIST Handwritten Digits (Grayscale)
- CIFAR-10 (RGB)
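A loading and preprocessing sketch for the two datasets, again assuming tf.keras (the note only says the runs happen in Colab notebooks). Scaling pixels to [0, 1] and one-hot encoding the labels are assumptions, not requirements stated in the note.

```python
import tensorflow as tf

def load_dataset(name):
    """Return normalized images and one-hot labels for 'mnist' or 'cifar10'."""
    if name == "mnist":
        (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
        x_train = x_train[..., None]  # add the channel axis: (28, 28) -> (28, 28, 1)
        x_test = x_test[..., None]
    else:
        (x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
    # Scale pixel values to [0, 1] and one-hot encode the 10 class labels.
    x_train = x_train.astype("float32") / 255.0
    x_test = x_test.astype("float32") / 255.0
    y_train = tf.keras.utils.to_categorical(y_train, 10)
    y_test = tf.keras.utils.to_categorical(y_test, 10)
    return (x_train, y_train), (x_test, y_test)

(x_train, y_train), (x_test, y_test) = load_dataset("mnist")
```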
Different Activation Functions for the Hidden Layer to be used (see the registry sketch after this list):
Baseline: ReLU (Rectified Linear Unit)
- Leaky ReLU
- ELU (Exponential Linear Unit)
- SELU (Scaled Exponential Linear Unit)
- Swish
- PReLU (Parametric ReLU)
- Mish
- Sigmoid
- Tanh
- Maxout
- Softmax
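A sketch of how the hidden-layer activation could be swapped per run, assuming tf.keras. Element-wise functions can be passed directly to Dense(activation=...), while Leaky ReLU, PReLU, and Maxout behave as layers in Keras, so they appear below as layer factories to apply after a linear Dense layer. Mish is defined manually because it is not a built-in activation string in every TensorFlow version, and Maxout would need an extra dependency such as tensorflow_addons (tfa.layers.Maxout) or a custom layer.

```python
import tensorflow as tf
from tensorflow.keras import layers

def mish(x):
    # mish(x) = x * tanh(softplus(x))
    return x * tf.math.tanh(tf.math.softplus(x))

# Activations usable directly as Dense(units, activation=...)
ELEMENTWISE_ACTIVATIONS = {
    "relu": "relu",          # baseline
    "elu": "elu",
    "selu": "selu",
    "swish": tf.keras.activations.swish,
    "mish": mish,
    "sigmoid": "sigmoid",
    "tanh": "tanh",
    "softmax": "softmax",    # unusual choice for a hidden layer, listed for comparison
}

# Activations that are layers in Keras: apply them after a linear Dense layer,
# e.g. Dense(units) followed by LAYER_ACTIVATIONS["prelu"]().
LAYER_ACTIVATIONS = {
    "leaky_relu": lambda: layers.LeakyReLU(),
    "prelu": lambda: layers.PReLU(),
    # "maxout": lambda: tfa.layers.Maxout(num_units=...),  # requires tensorflow_addons
}
```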
Fixed Training Configuration (a compile-and-fit sketch follows this list):
- Optimizer: Adam (Adaptive Moment Estimation)
- Loss Function: Categorical Cross-Entropy
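A compile-and-fit sketch for one run, reusing build_model and the loaded arrays from the sketches above. The epoch count, batch size, and validation split are illustrative assumptions rather than values taken from the note.

```python
import tensorflow as tf

model = build_model((28, 28, 1), activation="relu")   # baseline ReLU run on MNIST
model.compile(optimizer=tf.keras.optimizers.Adam(),   # Adam with default settings
              loss="categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(x_train, y_train,
                    validation_split=0.1,              # assumed hold-out fraction
                    epochs=20, batch_size=128)         # assumed, not from the note
test_loss, test_acc = model.evaluate(x_test, y_test)
```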
Evaluation Metrics and Analysis (see the evaluation sketch after this list):
- Accuracy
- Loss
- Confusion Matrix
- Precision
- Recall
- F1 Score
- Cross-Validation (e.g., K-Fold)
- Statistical Analysis techniques (to compare results across activation functions)
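A sketch of the evaluation step, assuming scikit-learn for the metrics and the k-fold split (the note does not name a library). Here model_fn stands for a factory such as the build_model sketch above; the per-fold scores it returns are what the statistical analysis would compare across activation functions.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)
from sklearn.model_selection import KFold

def evaluate_fold(model, x_val, y_val_onehot):
    """Confusion matrix plus macro-averaged precision/recall/F1 for one fold."""
    y_true = np.argmax(y_val_onehot, axis=1)
    y_pred = np.argmax(model.predict(x_val, verbose=0), axis=1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return {"accuracy": accuracy_score(y_true, y_pred),
            "precision": precision, "recall": recall, "f1": f1,
            "confusion_matrix": confusion_matrix(y_true, y_pred)}

def cross_validate(model_fn, x, y, k=5):
    """Train a fresh model on each of k folds and collect per-fold metrics."""
    scores = []
    for train_idx, val_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(x):
        model = model_fn()
        model.compile(optimizer="adam", loss="categorical_crossentropy",
                      metrics=["accuracy"])
        model.fit(x[train_idx], y[train_idx], epochs=10, batch_size=128, verbose=0)
        scores.append(evaluate_fold(model, x[val_idx], y[val_idx]))
    return scores  # per-fold metrics, ready for statistical comparison across activations
```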