A complete implementation of a Softmax classifier with cross-entropy loss for multi-class classification, applied to the CIFAR-10 dataset.
This project implements both naive and vectorized versions of the Softmax classifier with the following features:
- Numerical Stability: Proper handling of overflow/underflow issues
- Vectorized Implementation: Efficient matrix operations for faster training
- Cross-entropy Loss: Standard loss function for multi-class classification
- L2 Regularization: Prevents overfitting
- CIFAR-10 Dataset: Real-world image classification task
- ✅ Naive implementation with explicit loops
- ✅ Vectorized implementation for efficiency
- ✅ Numerical stability improvements
- ✅ L2 regularization support
- ✅ Training visualization
- ✅ Model saving/loading
- ✅ Performance comparison between implementations
- Clone the repository:
  ```bash
  git clone https://github.com/HoomKh/softmax-classifier.git
  cd softmax-classifier
  ```
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Download the CIFAR-10 dataset:
  - Place the `cifar-10-batches-py` folder in the `dataset/` directory
  - Or update the path in the notebook to point to your dataset location
```
softmax-classifier/
├── README.md
├── requirements.txt
├── softmax.py                 # Core softmax implementations
├── linear_classifier.py       # Linear classifier base class
├── data_utils.py              # Data loading utilities
├── softmax_classifier.ipynb   # Main training notebook
└── dataset/                   # CIFAR-10 dataset (not included)
    └── cifar-10-batches-py/
```
- Start Jupyter:
  ```bash
  jupyter notebook
  ```
- Open `softmax_classifier.ipynb`
- Run all cells to:
  - Load and preprocess CIFAR-10 data
  - Train the softmax classifier
  - Visualize training progress
  - Evaluate model performance
- Learning Rate: `1e-7` (default)
- Regularization Strength: `2.5e4` (default)
- Training Iterations: `1500` (default)
- Batch Size: `200` (default)
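Put together, a training run with these defaults might look like the sketch below. The `Softmax` class name and the `train`/`predict` signatures are assumptions based on the CS231n-style `linear_classifier.py`; `X_train`, `y_train`, `X_val`, and `y_val` are the preprocessed arrays produced in the notebook.

```python
from linear_classifier import Softmax  # assumed CS231n-style class name

classifier = Softmax()
loss_history = classifier.train(
    X_train, y_train,    # preprocessed CIFAR-10 training split
    learning_rate=1e-7,  # default learning rate
    reg=2.5e4,           # default L2 regularization strength
    num_iters=1500,      # default number of SGD iterations
    batch_size=200,      # default minibatch size
    verbose=True,
)
val_accuracy = (classifier.predict(X_val) == y_val).mean()
```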
The softmax loss combines:
- Softmax activation: `P(y=k|x) = exp(s_k) / Σ_i exp(s_i)`
- Cross-entropy loss: `L = -log(P(y_true|x))`
- L2 regularization: `R(W) = λ * Σ W²`
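Concretely, these three pieces combine as in the following minimal NumPy sketch for a single example (illustrative only, not the project's actual code):

```python
import numpy as np

def softmax_loss_single(W, x, y, lam):
    """Softmax cross-entropy loss for one example x with true label y."""
    scores = x.dot(W)                                # class scores s_k
    probs = np.exp(scores) / np.sum(np.exp(scores))  # P(y=k|x); bare exp() can overflow
    data_loss = -np.log(probs[y])                    # -log P(y_true|x)
    reg_loss = lam * np.sum(W * W)                   # λ * Σ W²
    return data_loss + reg_loss
```

The bare `np.exp(scores)` in this naive form overflows for large scores, which is exactly what the stability fixes below address.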
The implementation includes several numerical stability improvements:
- Subtracting the maximum score before exponentiation (see the sketch below)
- Proper gradient normalization
- Correct regularization loss calculation
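The max-subtraction trick works because `exp(s_k - c) / Σ_i exp(s_i - c) = exp(s_k) / Σ_i exp(s_i)` for any constant `c`; choosing `c = max_i s_i` keeps every exponent at or below zero, so `exp` cannot overflow. A minimal sketch:

```python
import numpy as np

def stable_softmax(scores):
    """Numerically stable softmax over the last axis of a score array."""
    # Shift so the largest score is 0; exp() then stays in (0, 1].
    shifted = scores - np.max(scores, axis=-1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores, axis=-1, keepdims=True)
```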
On the CIFAR-10 dataset:
- Training Accuracy: ~32.6%
- Validation Accuracy: ~34%
- Training Time: ~0.9 seconds for 1500 iterations
| Implementation | Training Time | Loss |
|----------------|---------------|------|
| Naive          | ~2.5s         | 2.34 |
| Vectorized     | ~0.07s        | 2.34 |
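A comparison like this can be reproduced with simple wall-clock timing. In the sketch below, `softmax_loss_naive` and `softmax_loss_vectorized` are the CS231n-style function names assumed to live in `softmax.py`, and random data stands in for the preprocessed CIFAR-10 matrices:

```python
import time
import numpy as np
from softmax import softmax_loss_naive, softmax_loss_vectorized  # assumed names

rng = np.random.default_rng(0)
W = 0.0001 * rng.standard_normal((3073, 10))  # weights incl. bias dimension
X = rng.standard_normal((500, 3073))          # 500 stand-in examples
y = rng.integers(0, 10, size=500)             # stand-in labels

tic = time.time()
loss_naive, _ = softmax_loss_naive(W, X, y, 2.5e4)
print(f"naive:      loss {loss_naive:.4f}  time {time.time() - tic:.3f}s")

tic = time.time()
loss_vec, _ = softmax_loss_vectorized(W, X, y, 2.5e4)
print(f"vectorized: loss {loss_vec:.4f}  time {time.time() - tic:.3f}s")
```

Both versions should report the same loss; only the runtime differs.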
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Email: khoshbinhooman@gmail.com
- GitHub: HoomKh
- CIFAR-10 dataset from the Canadian Institute For Advanced Research (CIFAR)
- Implementation based on CS231n course materials
- Module not found errors: Ensure all dependencies are installed
- Dataset path errors: Update the `cifar10_dir` path in the notebook (see the snippet after this list)
- Memory issues: Reduce the batch size or use a smaller subset of the dataset
- Numerical instability: The implementation already includes fixes for this
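For the dataset path fix, the relevant notebook line looks roughly like this (`load_CIFAR10` is the CS231n-style helper assumed to live in `data_utils.py`):

```python
from data_utils import load_CIFAR10  # assumed CS231n-style helper

cifar10_dir = 'dataset/cifar-10-batches-py'  # adjust to your local dataset path
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
```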
If you encounter any issues:
- Check the troubleshooting section above
- Search existing issues on GitHub
- Create a new issue with detailed error information