# Fashion MNIST Classification with TensorFlow and Keras

This project demonstrates the process of building and optimizing a neural network model to classify images from the Fashion MNIST dataset using TensorFlow and Keras.
## Table of Contents
- Introduction
- Dataset
- Model Architecture
- Data Preprocessing
- Model Training
- Learning Rate Optimization
- Model Evaluation
- Results
- Visualizations
- Future Work
- Dependencies
## Introduction

This project aims to classify clothing items from the Fashion MNIST dataset using various neural network architectures and optimization techniques. We explore the impact of data normalization, model complexity, and learning rate scheduling on model performance.
## Dataset

We use the Fashion MNIST dataset: 70,000 grayscale 28×28 images of clothing items in 10 categories, split into 60,000 training images and 10,000 test images.
Categories:
- T-shirt/top
- Trouser
- Pullover
- Dress
- Coat
- Sandal
- Shirt
- Sneaker
- Bag
- Ankle boot
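The integer labels 0-9 map to these categories in order. A small helper makes that mapping explicit (the loader call is shown as a comment since it downloads data on first use; the helper name is ours):

```python
# Fashion MNIST class names, indexed by their integer label (0-9).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label: int) -> str:
    """Map an integer label (0-9) to its clothing category."""
    return CLASS_NAMES[label]

# Loading the split used in this project (downloads on first call):
# (x_train, y_train), (x_test, y_test) = \
#     tf.keras.datasets.fashion_mnist.load_data()
# x_train.shape == (60000, 28, 28); x_test.shape == (10000, 28, 28)
```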
## Model Architecture

We experiment with multiple model architectures:

- Basic model (Model 1):
  - Flatten layer
  - Dense layer (4 units, ReLU activation)
  - Dense layer (4 units, ReLU activation)
  - Output layer (10 units, softmax activation)
- Deeper model (Models 2, 3, and 4):
  - Flatten layer
  - Dense layer (4 units, ReLU activation)
  - Dense layer (4 units, ReLU activation)
  - Dense layer (4 units, ReLU activation)
  - Output layer (10 units, softmax activation)
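These architectures can be sketched with the Keras Sequential API; the `build_model` helper below is ours, parameterized by the number of hidden layers from the lists above:

```python
import tensorflow as tf

def build_model(hidden_layers: int) -> tf.keras.Model:
    """Sketch of the Dense models described above: hidden_layers=2
    gives Model 1; hidden_layers=3 gives the deeper variant used
    for Models 2-4."""
    layers = [tf.keras.Input(shape=(28, 28)), tf.keras.layers.Flatten()]
    layers += [tf.keras.layers.Dense(4, activation="relu")
               for _ in range(hidden_layers)]
    layers.append(tf.keras.layers.Dense(10, activation="softmax"))
    return tf.keras.Sequential(layers)

model_1 = build_model(hidden_layers=2)  # 3,210 trainable parameters
```

With only 4 units per hidden layer the models are deliberately tiny, which is why accuracy tops out near 80% (see Results).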
## Data Preprocessing

We normalize the pixel values of the images by dividing them by 255, scaling them from [0, 255] to [0, 1].
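A minimal sketch of the scaling step, using a toy array in place of the real image tensors:

```python
import numpy as np

# Raw pixels are uint8 in [0, 255]; dividing by 255.0 casts to float
# and scales to [0, 1], which helps gradient-based training.
x_raw = np.array([[0, 64, 255]], dtype=np.uint8)  # toy stand-in for images
x_norm = x_raw / 255.0

print(x_norm.dtype)             # float64
print(x_norm.min(), x_norm.max())  # 0.0 1.0
```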
## Model Training

We train the models using the following configuration:
- Loss function: Sparse Categorical Crossentropy
- Optimizer: Adam
- Metrics: Accuracy
- Epochs: 10-40 (varying by experiment)
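Putting the configuration above together, a self-contained sketch (random stand-in data so it runs on its own; the real run uses the normalized Fashion MNIST arrays and 10-40 epochs):

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# The listed configuration: sparse categorical cross-entropy (integer
# labels, no one-hot encoding needed), Adam optimizer, accuracy metric.
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"],
)

# Random stand-in data, one epoch, just to show the call shape.
x = np.random.rand(64, 28, 28).astype("float32")
y = np.random.randint(0, 10, size=64)
history = model.fit(x, y, epochs=1, verbose=0)
```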
## Learning Rate Optimization

We implement a learning rate scheduler to find a good learning rate for our model. The learning rate is increased exponentially over 40 epochs, and we plot learning rate against loss to identify the value at which the loss falls fastest.
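One common way to implement this is a schedule function handed to `tf.keras.callbacks.LearningRateScheduler`; the exact base of 1e-4 and growth rate below are illustrative assumptions, not the project's recorded values:

```python
def lr_schedule(epoch: int) -> float:
    """Exponentially increasing learning rate: 1e-4 at epoch 0,
    multiplied by 10 every 20 epochs, so roughly 1e-2 by epoch 40.
    (Base and growth rate are illustrative assumptions.)"""
    return 1e-4 * 10 ** (epoch / 20)

# Hooked into training via a Keras callback:
# lr_callback = tf.keras.callbacks.LearningRateScheduler(lr_schedule)
# history = model.fit(..., epochs=40, callbacks=[lr_callback])
# Then plot history.history["loss"] against the schedule and pick the
# learning rate where loss decreases fastest, before it diverges.
```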
## Model Evaluation

We evaluate our models using the following metrics and diagnostics:
- Training and validation accuracy
- Training and validation loss
- Confusion matrix
- Random image prediction visualization
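For the confusion matrix we can lean on scikit-learn; the toy labels below are illustrative, not the project's actual predictions:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy true/predicted labels for 3 of the 10 classes.
y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 1, 2, 1, 1, 0])

# Rows are true classes, columns are predicted classes; off-diagonal
# entries count misclassifications (here one class-2 item predicted as 1).
cm = confusion_matrix(y_true, y_pred)
print(cm)

# With a trained model, predictions come from the softmax outputs:
# y_pred = model.predict(x_test).argmax(axis=1)
```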
## Results

Accuracy results for our best-performing model (Model 4):

| Metric | Value |
|---|---|
| Training Accuracy | 81.71% |
| Validation Accuracy | 80.3% |
## Future Work

- Experiment with convolutional neural network (CNN) architectures
- Implement data augmentation techniques
- Try transfer learning with pre-trained models
- Explore ensemble methods for improved accuracy
## Dependencies

- TensorFlow 2.x
- NumPy
- Pandas
- Matplotlib
- Scikit-learn