Welcome to the Machine Learning and Deep Learning practice exercises! This repository contains weekly exercises for practicing classification algorithms with TensorFlow and deep learning techniques. Each week focuses on a different algorithm and introduces regularization and optimization techniques along the way. The series culminates in Natural Language Processing (NLP) tasks.
Exercise: Implement a binary logistic regression model to classify whether a patient has diabetes using the Pima Indians Diabetes Database.
Dataset: Pima Indians Diabetes Database
Tasks:
- Load and preprocess the dataset (handle missing values, normalization).
- Build a logistic regression model using TensorFlow.
- Train the model with different regularization strengths and evaluate its performance using metrics such as accuracy, precision, recall, and F1-score.
- Visualize the training process (loss curve) and compare performance with and without regularization.
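
The evaluation metrics named in the tasks above can be computed directly from the confusion-matrix counts. The following is a minimal sketch in plain Python, not part of the repository; the function name `binary_classification_metrics` and the toy labels are made up for illustration.

```python
def binary_classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only (1 = diabetic, 0 = not diabetic).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_classification_metrics(y_true, y_pred)
print(metrics)
```

Running the same function on predictions from models trained with different regularization strengths gives a like-for-like comparison.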
Exercise: Implement the k-NN algorithm to classify handwritten digits using the MNIST dataset. Use TensorFlow to manage data but implement k-NN from scratch.
Dataset: MNIST
Tasks:
- Load the MNIST dataset and preprocess it.
- Implement the k-NN algorithm from scratch.
- Tune the hyperparameter k to find the optimal number of neighbors.
- Use TensorFlow functions to handle the dataset and calculate distances.
- Evaluate the model on the test set and analyze its performance.
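
As a starting point, the core of k-NN fits in a few lines. The sketch below uses plain Python lists and a handful of made-up 2-D points instead of MNIST images, and it does not use TensorFlow for the distance step as the exercise asks. The name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = []
    for point, label in zip(train_points, train_labels):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(point, query)))
        distances.append((dist, label))
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters (illustration only).
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_points, train_labels, (0.5, 0.5), k=3))
print(knn_predict(train_points, train_labels, (5.5, 5.5), k=3))
```

For MNIST, each image would be flattened into a vector and the distance computation vectorized. Tuning k then means repeating the evaluation for several values on a held-out validation split.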
Exercise: Build a decision tree classifier to predict species of iris flowers using the Iris dataset. Implement the model using TensorFlow Decision Forests.
Dataset: Iris Dataset
Tasks:
- Load and preprocess the Iris dataset.
- Use TensorFlow Decision Forests to build the decision tree model with pruning.
- Train and evaluate the model using appropriate metrics.
- Visualize the decision tree and discuss feature importance.
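
When discussing feature importance, it can help to see how a single split is scored. The sketch below computes Gini impurity and the impurity decrease of a split. Gini is one common criterion and is used here as an assumption; this README does not state which criterion TensorFlow Decision Forests applies. The function names and labels are illustrative.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children


# Made-up labels: a split that separates the two species perfectly.
parent = ["setosa", "setosa", "virginica", "virginica"]
left, right = parent[:2], parent[2:]
print(gini_impurity(parent))
print(impurity_decrease(parent, left, right))
```

A feature whose splits produce large impurity decreases across the tree is typically ranked as more important under impurity-based measures.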
Exercise: Implement an SVM to classify breast tumors as malignant or benign using the Breast Cancer Wisconsin (Diagnostic) dataset. Use TensorFlow for data handling and Scikit-learn for the SVM implementation.
Dataset: Breast Cancer Wisconsin (Diagnostic) Dataset
Tasks:
- Load and preprocess the dataset.
- Implement an SVM using Scikit-learn.
- Perform hyperparameter tuning (C, gamma) to optimize model performance.
- Use TensorFlow to handle data preprocessing and visualization.
- Train the SVM model and evaluate its performance.
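
To build intuition for the gamma hyperparameter before tuning it, the sketch below evaluates an RBF kernel by hand. It assumes the RBF kernel, since this README does not say which kernel to use. The function name `rbf_kernel` and the sample points are illustrative and not part of Scikit-learn's API.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF similarity: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two made-up feature vectors at squared distance 2.
x, y = (0.0, 0.0), (1.0, 1.0)
for gamma in (0.01, 0.1, 1.0, 10.0):
    # Larger gamma makes similarity fall off faster with distance.
    print(gamma, rbf_kernel(x, y, gamma))
```

A large gamma makes each training point influence only its immediate neighborhood, which can overfit. A small gamma gives smoother decision boundaries. C separately controls how heavily margin violations are penalized, so the two are usually tuned together.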
Exercise: Build a Naive Bayes classifier to classify SMS messages as spam or not spam using the SMS Spam Collection dataset. Use TensorFlow for data handling and Scikit-learn for the classifier.
Dataset: SMS Spam Collection Dataset
Tasks:
- Load and preprocess the dataset (tokenization, TF-IDF).
- Implement a Naive Bayes classifier using Scikit-learn.
- Apply Laplace smoothing to handle zero probabilities.
- Use TensorFlow for data handling and visualization.
- Train and evaluate the model on the dataset.
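
The Laplace smoothing step can be illustrated without any library. The sketch below estimates per-word probabilities for one class from raw token counts (not TF-IDF weights, for simplicity) and adds `alpha` to every count so that unseen words never get probability zero. The function name and the tiny vocabulary are made up for illustration.

```python
from collections import Counter


def smoothed_word_probabilities(class_tokens, vocabulary, alpha=1.0):
    """Estimate P(word | class) with additive (Laplace) smoothing."""
    counts = Counter(class_tokens)
    total = sum(counts[word] for word in vocabulary)
    # Adding alpha per vocabulary word keeps the probabilities summing to 1.
    denominator = total + alpha * len(vocabulary)
    return {word: (counts[word] + alpha) / denominator for word in vocabulary}


# Made-up tokens from "spam" messages; "meeting" never appears in spam.
spam_tokens = ["free", "prize", "free", "win"]
vocabulary = ["free", "meeting", "prize", "win"]
probs = smoothed_word_probabilities(spam_tokens, vocabulary, alpha=1.0)
print(probs)
```

Without smoothing, "meeting" would have probability 0 under the spam class, and a single occurrence would zero out the whole product of word probabilities. Scikit-learn's `MultinomialNB` exposes the same idea through its `alpha` parameter.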
Exercise: Build a Random Forest classifier to predict whether a passenger survived the Titanic disaster. Use TensorFlow Decision Forests.
Dataset: Titanic Dataset
Tasks:
- Load and preprocess the Titanic dataset (handle missing values, encoding categorical variables).
- Implement a Random Forest classifier using TensorFlow Decision Forests.
- Perform hyperparameter tuning (number of trees, max depth) to optimize model performance.
- Train and evaluate the model using appropriate metrics.
- Visualize the feature importance and discuss the results.
Exercise: Implement a simple neural network to classify fashion items using the Fashion MNIST dataset. Use TensorFlow/Keras for the implementation.
Dataset: Fashion MNIST
Tasks:
- Load and preprocess the Fashion MNIST dataset.
- Build a neural network with dropout and L2 regularization using TensorFlow/Keras.
- Train the model and evaluate its performance using accuracy and loss metrics.
- Visualize the training process and the performance of the model on test data.
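
Before reaching for the Keras layers, it may help to see what dropout and L2 regularization compute. The sketch below is a plain-Python illustration with hypothetical function names; it is not how Keras implements them internally. It uses "inverted" dropout, where kept activations are scaled by 1 / (1 - rate), and an L2 penalty equal to a coefficient times the sum of squared weights.

```python
import random


def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    # Scaling by 1 / keep_prob keeps the expected activation unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


def l2_penalty(weights, coefficient):
    """L2 regularization term added to the training loss."""
    return coefficient * sum(w * w for w in weights)


rng = random.Random(0)  # fixed seed so the illustration is repeatable
print(inverted_dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng))
print(l2_penalty([1.0, -2.0], coefficient=0.01))
```

Dropout is applied only during training. At evaluation time the activations pass through unchanged, which is why the rescaling happens at training time.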
Exercise: Implement a CNN to classify images in the CIFAR-10 dataset. Use TensorFlow/Keras for the implementation.
Dataset: CIFAR-10
Tasks:
- Load and preprocess the CIFAR-10 dataset.
- Build a CNN using TensorFlow/Keras.
- Apply data augmentation techniques (rotation, flipping, cropping) to the training data.
- Train the model and evaluate its performance using accuracy and loss metrics.
- Visualize the training process and some of the learned filters.
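
Two of the augmentations listed above, flipping and cropping, are simple index operations. The sketch below shows them on a tiny image stored as a nested list of rows. The helper names are hypothetical, and a real pipeline would apply the equivalent TensorFlow/Keras operations to batches of CIFAR-10 images.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


def random_crop(image, crop_height, crop_width, rng):
    """Cut a crop_height x crop_width window at a random position."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_height)    # inclusive bounds
    left = rng.randint(0, width - crop_width)
    return [row[left:left + crop_width]
            for row in image[top:top + crop_height]]


# A made-up 4x4 "image" whose pixel values encode their position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
rng = random.Random(0)
print(horizontal_flip(image))
print(random_crop(image, 3, 3, rng))
```

Augmentation is applied only to the training data. The validation and test images stay unmodified so that the reported metrics remain comparable.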
Exercise: Implement an RNN to classify reviews from the IMDB movie review dataset as positive or negative. Use TensorFlow/Keras for the implementation.
Dataset: IMDB Movie Reviews
Tasks:
- Load and preprocess the IMDB dataset (tokenization, padding).
- Build an RNN (LSTM/GRU) using TensorFlow/Keras.
- Train the model and evaluate its performance using accuracy and loss metrics.
- Visualize the training process and discuss the results.
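
The padding step exists because an RNN batch needs sequences of equal length. The sketch below shows the idea on made-up token IDs. The helper `pad_or_truncate` is hypothetical, not the Keras utility, and it pads and truncates at the end of each sequence as an arbitrary choice.

```python
def pad_or_truncate(sequences, max_length, pad_value=0):
    """Force every sequence to exactly `max_length` token IDs."""
    padded = []
    for seq in sequences:
        trimmed = list(seq[:max_length])           # drop tokens past the limit
        missing = max_length - len(trimmed)
        padded.append(trimmed + [pad_value] * missing)  # fill the remainder
    return padded


# Made-up tokenized reviews of different lengths.
reviews = [[5, 8, 2], [7], [1, 2, 3, 4, 5, 6]]
print(pad_or_truncate(reviews, max_length=4))
# → [[5, 8, 2, 0], [7, 0, 0, 0], [1, 2, 3, 4]]
```

Whether to pad at the start or the end can affect a recurrent model, because the final hidden state is computed after reading the last positions. It is worth trying both.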
Exercise: Implement an ensemble model combining different classifiers (e.g., logistic regression, decision trees, and SVM) to classify the Wine dataset. Use TensorFlow for data handling and Scikit-learn for the ensemble methods.
Dataset: Wine Dataset
Tasks:
- Load and preprocess the Wine dataset.
- Implement individual classifiers using Scikit-learn.
- Combine the classifiers using ensemble techniques (e.g., voting, stacking) with regularization.
- Train and evaluate the ensemble model.
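
Hard voting, the simplest of the ensemble techniques mentioned above, takes the most common prediction per sample. The sketch below uses hand-written prediction lists to stand in for three trained classifiers. The function name `hard_vote` and the predictions are made up for illustration.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model prediction lists by majority vote per sample."""
    ensemble = []
    # zip(*...) regroups the predictions so each tuple holds one sample's votes.
    for sample_votes in zip(*predictions_per_model):
        ensemble.append(Counter(sample_votes).most_common(1)[0][0])
    return ensemble


# Made-up class predictions for four wine samples.
logreg_preds = [0, 1, 2, 1]
tree_preds = [0, 2, 2, 1]
svm_preds = [1, 1, 2, 0]
print(hard_vote([logreg_preds, tree_preds, svm_preds]))
# → [0, 1, 2, 1]
```

With three classes and three voters, a three-way tie is possible. This sketch resolves ties in favor of the label counted first, which is an arbitrary choice. Soft voting (averaging predicted probabilities) and stacking avoid this issue.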
Exercise: Use a pre-trained model (e.g., VGG16, ResNet) to classify images in the Cats vs. Dogs dataset. Use TensorFlow/Keras for the implementation.
Dataset: Cats vs. Dogs
Tasks:
- Load and preprocess the Cats vs. Dogs dataset.
- Use a pre-trained model and fine-tune it for the classification task.
- Apply data augmentation and regularization techniques to improve performance.
- Train the model and evaluate its performance.
- Visualize the training process and discuss the results.
Exercise: Implement a GAN to generate new images based on the MNIST dataset. Use TensorFlow/Keras for the implementation.
Dataset: MNIST
Tasks:
- Load and preprocess the MNIST dataset.
- Build and train a GAN using TensorFlow/Keras with gradient penalty for stability.
- Generate new images and evaluate the quality of generated images.
- Visualize the training process and the generated images.
Exercise: Fine-tune a BERT model for text classification on the IMDB movie review dataset. Use TensorFlow/Keras for the implementation.
Dataset: IMDB Movie Reviews
Tasks:
- Load and preprocess the IMDB dataset (tokenization using BERT tokenizer).
- Fine-tune a pre-trained BERT model for the classification task using TensorFlow/Keras.
- Apply techniques like learning rate scheduling and dropout for regularization.
- Train the model and evaluate its performance using accuracy and loss metrics.
- Visualize the training process and discuss the results.
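
Learning rate scheduling can be prototyped as a plain function of the training step. The sketch below assumes a linear warmup followed by a linear decay to zero, a pattern often used when fine-tuning transformer models. The README does not prescribe a particular schedule, and the step counts and peak rate shown are hypothetical values.

```python
def warmup_linear_decay(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at `step`: linear ramp up, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / warmup_steps
    # Decay from peak_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Hypothetical settings for illustration only.
for step in (0, 50, 100, 550, 1000):
    print(step, warmup_linear_decay(step, total_steps=1000,
                                    warmup_steps=100, peak_lr=2e-5))
```

A function like this can be wrapped in a Keras learning rate schedule or callback. Plotting it against the step count is a quick check before starting a long fine-tuning run.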
Getting started:
- Clone this repository to your local machine.
- Set up your Python environment with the required libraries (TensorFlow, Scikit-learn, etc.).
- Follow the weekly exercises and complete the tasks.
- Share your results and insights with your friends for discussion and further learning.
Requirements:
- Python 3.6+
- TensorFlow 2.0+
- Scikit-learn
- Matplotlib (for visualization)
- Jupyter Notebook (optional but recommended)
Feel free to contribute by adding more exercises, improving existing ones, or fixing any issues. Fork this repository and submit a pull request with your changes.
This project is licensed under the MIT License - see the LICENSE file for details.
Happy learning and coding!