# Fashion MNIST Classification with TensorFlow and Keras

This project demonstrates the process of building and optimizing a neural network model to classify images from the Fashion MNIST dataset using TensorFlow and Keras.
## Table of Contents
- Introduction
- Dataset
- Model Architecture
- Data Preprocessing
- Model Training
- Learning Rate Optimization
- Model Evaluation
- Results
- Visualizations
- Future Work
- Dependencies
## Introduction

This project aims to classify clothing items from the Fashion MNIST dataset using various neural network architectures and optimization techniques. We explore the impact of data normalization, model complexity, and learning rate scheduling on model performance.
## Dataset

We use the Fashion MNIST dataset: 70,000 grayscale 28×28 images of clothing items in 10 categories, split into 60,000 training images and 10,000 test images.
Categories:
- T-shirt/top
- Trouser
- Pullover
- Dress
- Coat
- Sandal
- Shirt
- Sneaker
- Bag
- Ankle boot
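The integer labels 0-9 map to these categories in order. A small helper makes that mapping explicit (the loader call is shown as a comment since it downloads data on first use; the helper name is ours):

```python
# Fashion MNIST class names, indexed by their integer label (0-9).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label: int) -> str:
    """Map an integer label (0-9) to its clothing category."""
    return CLASS_NAMES[label]

# Loading the split used in this project (downloads on first call):
# (x_train, y_train), (x_test, y_test) = \
#     tf.keras.datasets.fashion_mnist.load_data()
# x_train.shape == (60000, 28, 28); x_test.shape == (10000, 28, 28)
```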
## Model Architecture

We experiment with multiple model architectures:

- Basic model (Model 1):
  - Flatten layer
  - Dense layer (4 units, ReLU activation)
  - Dense layer (4 units, ReLU activation)
  - Output layer (10 units, softmax activation)
- Deeper model (Models 2, 3, and 4):
  - Flatten layer
  - Dense layer (4 units, ReLU activation)
  - Dense layer (4 units, ReLU activation)
  - Dense layer (4 units, ReLU activation)
  - Output layer (10 units, softmax activation)
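These architectures can be sketched with the Keras Sequential API; the `build_model` helper below is ours, parameterized by the number of hidden layers from the lists above:

```python
import tensorflow as tf

def build_model(hidden_layers: int) -> tf.keras.Model:
    """Sketch of the Dense models described above: hidden_layers=2
    gives Model 1; hidden_layers=3 gives the deeper variant used
    for Models 2-4."""
    layers = [tf.keras.Input(shape=(28, 28)), tf.keras.layers.Flatten()]
    layers += [tf.keras.layers.Dense(4, activation="relu")
               for _ in range(hidden_layers)]
    layers.append(tf.keras.layers.Dense(10, activation="softmax"))
    return tf.keras.Sequential(layers)

model_1 = build_model(hidden_layers=2)  # 3,210 trainable parameters
```

With only 4 units per hidden layer the models are deliberately tiny, which is why accuracy tops out near 80% (see Results).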
## Data Preprocessing

We normalize the pixel values of the images by dividing them by 255, scaling them from [0, 255] to [0, 1].
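A minimal sketch of the scaling step, using a toy array in place of the real image tensors:

```python
import numpy as np

# Raw pixels are uint8 in [0, 255]; dividing by 255.0 casts to float
# and scales to [0, 1], which helps gradient-based training.
x_raw = np.array([[0, 64, 255]], dtype=np.uint8)  # toy stand-in for images
x_norm = x_raw / 255.0

print(x_norm.dtype)             # float64
print(x_norm.min(), x_norm.max())  # 0.0 1.0
```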
## Model Training

We train the models using the following configuration:
- Loss function: Sparse Categorical Crossentropy
- Optimizer: Adam
- Metrics: Accuracy
- Epochs: 10-40 (varying by experiment)
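Putting the configuration above together, a self-contained sketch (random stand-in data so it runs on its own; the real run uses the normalized Fashion MNIST arrays and 10-40 epochs):

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# The listed configuration: sparse categorical cross-entropy (integer
# labels, no one-hot encoding needed), Adam optimizer, accuracy metric.
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"],
)

# Random stand-in data, one epoch, just to show the call shape.
x = np.random.rand(64, 28, 28).astype("float32")
y = np.random.randint(0, 10, size=64)
history = model.fit(x, y, epochs=1, verbose=0)
```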
## Learning Rate Optimization

We implement a learning rate scheduler to find a good learning rate for our model. The learning rate is increased exponentially over 40 epochs, and we plot learning rate against loss to identify the value at which the loss falls fastest.
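One common way to implement this is a schedule function handed to `tf.keras.callbacks.LearningRateScheduler`; the exact base of 1e-4 and growth rate below are illustrative assumptions, not the project's recorded values:

```python
def lr_schedule(epoch: int) -> float:
    """Exponentially increasing learning rate: 1e-4 at epoch 0,
    multiplied by 10 every 20 epochs, so roughly 1e-2 by epoch 40.
    (Base and growth rate are illustrative assumptions.)"""
    return 1e-4 * 10 ** (epoch / 20)

# Hooked into training via a Keras callback:
# lr_callback = tf.keras.callbacks.LearningRateScheduler(lr_schedule)
# history = model.fit(..., epochs=40, callbacks=[lr_callback])
# Then plot history.history["loss"] against the schedule and pick the
# learning rate where loss decreases fastest, before it diverges.
```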
## Model Evaluation

We evaluate our models using the following metrics and diagnostics:
- Training and validation accuracy
- Training and validation loss
- Confusion matrix
- Random image prediction visualization
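For the confusion matrix we can lean on scikit-learn; the toy labels below are illustrative, not the project's actual predictions:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy true/predicted labels for 3 of the 10 classes.
y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 1, 2, 1, 1, 0])

# Rows are true classes, columns are predicted classes; off-diagonal
# entries count misclassifications (here one class-2 item predicted as 1).
cm = confusion_matrix(y_true, y_pred)
print(cm)

# With a trained model, predictions come from the softmax outputs:
# y_pred = model.predict(x_test).argmax(axis=1)
```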
## Results

Accuracy results for our best-performing model (Model 4):

| Metric | Value |
|---|---|
| Training Accuracy | 81.71% |
| Validation Accuracy | 80.3% |
## Future Work

- Experiment with convolutional neural network (CNN) architectures
- Implement data augmentation techniques
- Try transfer learning with pre-trained models
- Explore ensemble methods for improved accuracy
## Dependencies

- TensorFlow 2.x
- NumPy
- Pandas
- Matplotlib
- Scikit-learn