100-Days-of-ML-Pt2

Daily log to track my progress on the 100 days of ML code challenge.

Description

A 100-day ML challenge to learn and develop machine learning products. Since this is my second time doing the challenge, I will focus on the production environment rather than the concepts and theory behind ML/DL models. I will place heavy emphasis on the ML pipeline and the process of taking an ML model and applying it to a real-world application.

Challenge Goals

[Additional]

Multi-modal Systems
Apache Flume
Apache Beam
Apache Airflow
Using Amdahl's Law
MySQL/Apache Spark
PostgreSQL
Optimizing/architecting software/hardware solutions for ML

Resources

Books

Deploy Machine Learning Models to Production With Flask, Streamlit, Docker, and Kubernetes on Google Cloud Platform

Building Machine Learning Powered Applications Going from Idea to Products

Building Machine Learning Pipelines Automating Model Life Cycles with TensorFlow

Hands-On GPU Programming with Python and CUDA: Explore High-performance Parallel Computing with CUDA

Web Resources

Made with ML

@article{madewithml, title = "MLOps - Made With ML", author = "Goku Mohandas", url = "https://madewithml.com/courses/mlops/organization/", year = "2021", }

YouTube Log

This whole challenge will be documented on YouTube during live streams. The link to the playlist: 100 Days of ML

Daily Goals

Day 1: Deploy a Linear Regression model using Flask

Develop a web application using Flask
Code a linear regression model
Deploy the trained model as a REST service
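
The "code a linear regression model" step can be sketched without any framework. This is a minimal illustration using closed-form least squares on toy data, not the model from the book or the live stream. `handle_predict` is a hypothetical stand-in for the body of a REST route: it takes an already-parsed JSON payload and returns a JSON-ready dict, so it could be called from a Flask view function.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


def handle_predict(model, payload):
    """Hypothetical request handler: parsed JSON body in, JSON-ready dict out."""
    if "x" not in payload:
        return {"error": "missing field 'x'"}
    return {"prediction": predict(model, float(payload["x"]))}


# Toy training data lying exactly on y = 2x + 1 (an assumption for illustration).
model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(handle_predict(model, {"x": 5}))
```

Keeping the handler free of framework objects makes it easy to unit test before wiring it to a route.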

Day 2: Deploy a Linear Regression model using Streamlit

Read the section about streamlit
Create UI using streamlit
Deploy the trained model as a REST service
Code an LSTM model

Day 3: Deploy a Deep Learning model using Streamlit

Train the LSTM model
Create UI using streamlit

Day 4: Deploy a Deep Learning model using Streamlit

Deploy the trained model as a REST service
Read Chapter 4: ML Deployment using Docker

Day 5: Deploy ML model using Docker

Create a Dockerfile for the Flask app
Create Docker image
Push our Docker image to DockerHub

Day 6: ML Deployment using Kubernetes

Read chapter 5: ML Deployment using Kubernetes
Create GCP Project
Enable and utilize the Kubernetes Engine API on GCP

Day 7: Scripting

Read topics under scripting from MadeWithML
Apply learning by adding it to current projects

Day 8: Reading Building ML Pipelines

Read Ch1: Introduction
Read Ch2: Introduction to TensorFlow Extended

Day 9: Reading Building ML Pipelines

Read Ch2: Introduction to TensorFlow Extended
Read Ch3: Data Ingestion

Day 10: TFX Setup

Download and setup TFX
Execute TFX Data Ingestion examples

Day 11: TFX Data Ingestion

Follow [TFX tutorial](https://www.tensorflow.org/tfx/tutorials/tfx/penguin_simple#install_tfx)

Day 12: TFX Data Ingestion

Try using Colab
Check online resources and try to debug the [TFX tutorial](https://www.tensorflow.org/tfx/tutorials/tfx/penguin_simple#install_tfx)

Day 13: Reading Building ML Pipelines

Read Ch4: Data Validation
Read online resources on TFX Data Validation

Day 14: Data Validation

Read online resources on TFX Data Validation
Execute example code for TFDV

Day 15: Reading Building ML Pipelines

Read Ch5: Data Preprocessing
Read online resources on TFX Data Preprocessing

Day 16: Data Preprocessing

Read more about feature engineering
Feature engineering vs ML engineering
Execute example code for TF Transform

Day 17: MLOps Production

Read about production from [Made with ML](https://madewithml.com/courses/mlops/)
Watch YouTube videos on CI/CD workflows
Learn more about GitHub Actions

Day 18: Reading Building ML Pipelines

Read Ch 6: Model Training
Read online resources on TFX Trainer Component

Day 19: Reading Building ML Pipelines

Read Ch 7: Model Analysis and Validation
Read online resources on TF Model Analysis

Day 20: Reading Building ML Pipelines

Read Ch 8: Model Deployment with TensorFlow Serving

Day 21: Reading Building ML Pipelines

Continue reading Ch8: Model Deployment with TensorFlow Serving
Read online resources on TF Serving

Day 22: Plan out an ML Product to Build

Look into simple ML models to deploy
Create the architecture for the pipeline
Set up/decide on GitHub project organization
Choose how to build front-end
What orchestration tool to use?

Day 23: Plan out an ML Product to Build

How to integrate CI/CD into project
Flask web deployment vs model server
Create order of events
Look into setting up GitHub Project

Day 24: Build Dockerfile for ML Project

List out dependencies
Create Dockerfile
Build Docker Image
Run Docker container
Check if GPU is being used

Day 25: Build Dockerfile for ML Project

Troubleshoot Dockerfile
Build Docker Image
Update README.md

Day 26: Setup TFX Pipeline Architecture

Setup TFX pipeline
Design TFX architecture

Day 27: TFX Pipeline

Understand the template code
Clear out the template files and rewrite the pipeline

Day 28: Import Data

Change permissions for files in /ml
Add the IMDB dataset into /data
Convert dataset to desired format

Day 29: Add Formatting/Styling

Set up formatting/styling
Black & flake8
Set up GitHub Actions

Day 30: Data Validation

Fix features.py
Generate statistics
Visualize statistics
Infer schema

Day 31: Troubleshooting Data Ingestion/Validation

Build updated docker image
Figure out why example gen is not loading

Day 32: Data Preprocessing

Change preprocessing.py
Test Preprocessing

Day 33: Continue Data Preprocessing

Change preprocessing.py
Test Preprocessing

Day 34: Continue Data Preprocessing

Test Preprocessing
Add TFX Transform to tfx pipeline components

Day 35: Model Training

Prototype the tfx trainer component in jupyter
Modify model.py

Day 36: Continue Model Training

Modify model.py
Add tfx trainer component to pipeline
Test pipeline

Day 37: Read Building ML Pipelines

Read Ch 12 of Building ML Pipelines - Kubeflow Pipelines

Day 38: Reading Building ML Pipelines

Read Ch 11 of Building ML Pipelines - Apache Beam & Apache Airflow
Read [Graph-based Neural Structured Learning in TFX](https://www.tensorflow.org/tfx/tutorials/tfx/neural_structured_learning#the_trainer_component)

Day 39: Debug Model Training

Figure out how the tfx IMDB example in tensorflow tutorial works
Understand the use for custom tfx components in the tutorial
Understand the preprocessing required for the model using the IMDB dataset

Day 40: Find resources for learning CUDA

Find a book to read for CUDA
Search for alternative resources online

Day 41: Start reading GPU Programming with CUDA

Download a digital version of the book
Start the introductory chapters
Using Amdahl's Law
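
Amdahl's Law bounds the speedup from parallelizing part of a program: with parallelizable fraction p and n workers, speedup = 1 / ((1 - p) + p / n). A small sketch follows; the 90% figure is an illustrative assumption, not a measurement from this project.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Theoretical speedup when `parallel_fraction` of the runtime is spread over `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Assumed workload: 90% of the runtime can be parallelized.
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

However many workers are added, the speedup stays below 1 / (1 - p), which is 10x for p = 0.9. This is why it pays to measure the serial portion before moving code to a GPU.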

Day 42: Start reading GPU Programming with CUDA

Learn about Mandelbrot set
What are profilers? cProfile module
Setting up GPU programming environment in Linux
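
To make the Mandelbrot and profiler items concrete, here is a rough sketch that profiles a plain-Python escape-time computation with the standard-library `cProfile` and `pstats` modules. The grid size, iteration cap, and function names are arbitrary choices for illustration, not the book's code.

```python
import cProfile
import io
import pstats


def escape_time(c, max_iter):
    """Iterations of z -> z*z + c before |z| exceeds 2, capped at max_iter."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter


def mandelbrot_grid(width, height, max_iter=50):
    """Escape times on a width x height grid covering [-2, 1] x [-1.5, 1.5]."""
    rows = []
    for row in range(height):
        imag = -1.5 + 3.0 * row / (height - 1)
        rows.append([
            escape_time(complex(-2.0 + 3.0 * col / (width - 1), imag), max_iter)
            for col in range(width)
        ])
    return rows


# Profile only the grid computation.
profiler = cProfile.Profile()
profiler.enable()
grid = mandelbrot_grid(60, 40)
profiler.disable()

# Collect the five most expensive entries by cumulative time into a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report should show nearly all the time inside `escape_time`. Each point is computed independently of the others, which is what makes this workload a natural candidate for a GPU.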

Day 43: Scouting Internships

Search for research internships
Create a spreadsheet with all the important dates listed
Create and organize a small Google document with all relevant links and information

Day 44: Set up development environment for CUDA

Test on local environment
Figure out dependencies
Create Dockerfile
Build Docker image
Test docker container with all the dependencies for programming CUDA using Python

Day 45: Read GPU Programming with CUDA

Read chapter 3: Getting Started with PyCUDA
CPU vs GPU timing
Parallelizing the mandelbrot set

Day 46: Troubleshoot TFX Transform component

Re-implement the tokenization
Use the Keras TextVectorization layer
Build sentence sequences

Day 47: Mandelbrot Set

Code using sequential utilizing CPU
Parallelize the code to run on GPU

Day 48: Read GPU Programming with CUDA

Functional programming
Parallel scan and reduction kernel basics
Kernels, Threads, Blocks, and Grids
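
The step pattern behind parallel scan and reduction can be simulated in plain Python. This sketch is not CUDA and not the book's kernel. It assumes a Hillis-Steele style inclusive scan and a pairwise tree reduction, where each pass of the outer loop stands for one synchronized parallel step and each index stands for one thread.

```python
from itertools import accumulate


def hillis_steele_scan(values):
    """Inclusive prefix sum in ceil(log2(n)) simulated parallel steps."""
    current = list(values)
    offset = 1
    while offset < len(current):
        # Build a new buffer from the old one (double buffering), so no
        # simulated thread reads a value already updated in this step.
        current = [
            current[i] + current[i - offset] if i >= offset else current[i]
            for i in range(len(current))
        ]
        offset *= 2
    return current


def tree_reduce(values):
    """Sum by pairwise combination; the stride doubles after every step."""
    data = list(values)
    if not data:
        return 0
    stride = 1
    while stride < len(data):
        # Every index visited here would be handled by its own thread on a GPU.
        for i in range(0, len(data) - stride, 2 * stride):
            data[i] += data[i + stride]
        stride *= 2
    return data[0]


sample = [3, 1, 7, 0, 4, 1, 6, 3]
print(hillis_steele_scan(sample))
print(tree_reduce(sample))
print(list(accumulate(sample)))  # sequential reference for comparison
```

On real hardware the list comprehension and the inner `for` loop would each be replaced by threads running at once, with a barrier between steps.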

Day 49: Write article on 100 Days of ML Pt2

Talk about all of the knowledge I've gained so far
Current objectives
Future tasks

Day 50: Review medium article on 100 Days of ML Pt2

Finish up future tasks
Include GitHub
Link website

Day 51: Read ML Applications

Start with introduction
Read chapter 1

Day 52: Read ML Applications

Continue chapter 1 - From Product Goal to ML Framing
Start chapter 2 - Create a Plan
Start reading ML Paper every week

Day 53: Read ML Applications

Finish chapter 2 - Create a Plan
Review Part I. Find the Correct ML Approach
Read more about model metrics

Day 54: Implement learning from chapters 1 and 2

Find an existing problem
Determine if it can be solved using ML
Look for datasets and determine what model would work

Day 55: Read ML Applications

Read Chapter 3 - Build Your First End-to-End Pipeline
Read Chapter 4 - Acquire an Initial Dataset

Day 56: Reading ML Applications

Continue Chapter 4
Review Part II. Build a Working Pipeline

Day 57: Setting up Paper-a-Week Repo

Setup GitHub repo for paper-a-week challenge
Decide on a list of papers to read
Start reading the first paper

Day 58: Reading ML Applications

Read Part III - Iterate on Models
Read Chapter 5 - Train and Evaluate Your Model

Day 59: Reading ML Applications

Finish reading Ch 5

Day 60: Reading ML Applications

Read Chapter 6 - Debug Your ML Problems

Day 61: Reading ML Applications

Finish reading Chapter 6
Think of ways to test preprocessing in ML-Pipelines project

Day 62: Prototype ML Pipelines Project

Build the model on Jupyter Notebook
Finish Data Ingestion

Day 63: ML Project pt 17 - Prototype

Finish filtering HTML tags from string
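
One possible way to filter HTML tags from a review string uses only the standard-library `html.parser` module. This is a sketch under the assumption that tags act as word separators (as `<br />` line breaks do); `strip_html` is a hypothetical helper name, not the function used in the project.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # also decodes entities like &amp;
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text):
    """Return `text` without tags, with runs of whitespace collapsed to one space."""
    parser = _TextExtractor()
    parser.feed(text)
    parser.close()  # flush any buffered trailing text
    return " ".join(" ".join(parser.parts).split())


print(strip_html("Great movie.<br /><br />Loved it &amp; would watch again."))
```

A parser handles attributes and entities more predictably than a regular expression, at the cost of inserting a space wherever a tag sat inside a word.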

Day 64: ML Project pt 18 - Prototype

Vectorize the filtered string
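
Vectorizing here means mapping each token to an integer id and padding to a fixed length. The sketch below is a plain-Python stand-in for what a text vectorization layer does, not the Keras implementation. The tokenizer regex, the reserved ids, and the function names are assumptions made for illustration.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # reserved ids: padding and out-of-vocabulary tokens


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (apostrophes kept)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(texts, max_tokens):
    """Map the most frequent tokens to ids 2..max_tokens-1."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    most_common = counts.most_common(max_tokens - 2)
    return {tok: idx for idx, (tok, _) in enumerate(most_common, start=2)}


def vectorize(text, vocab, seq_len):
    """Convert text to exactly seq_len ids, truncating or right-padding as needed."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(text)][:seq_len]
    return ids + [PAD] * (seq_len - len(ids))


corpus = ["the movie was great", "the plot was thin"]  # toy corpus
vocab = build_vocab(corpus, max_tokens=6)
print(vectorize("the plot was great", vocab, seq_len=6))
```

Fixing the sequence length up front keeps every example the same shape, which is what an embedding layer downstream expects.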

Day 65: ML Project pt 19 - Prototype

Figure out how to create embeddings
Create embeddings

Day 66: ML Project pt 20 - Prototype

Analyze processed data
What is the model doing?

Day 67: ML Project pt 21 - Prototype

How to build the DL model?
Build the DL model
Run tests

Day 68: Reading Data Science paper

Read and take notes on Data preprocessing - Tidy data - by Hadley Wickham

Day 69: Reading Tidy Data

Continue with section 3
Revise sections 1-3

Day 70: Reading Tidy Data

Continue with section 4
Finished reading Tidy Data

Day 71: Reading ML Applications

Start Chapter 7 - Using Classifiers for Writing Recommendations

Day 72: Reading ML Applications

Finish Chapter 7
Review Part III - Iterate on Models

Day 73: Paper a Week

Read Statistical Modeling: The Two Cultures - by Leo Breiman

Day 74: Paper a Week

Finished Section 3 and 4

Day 75: Paper a Week

Read section 5 - the use of data models

Day 76: Paper a Week

Start section 6 - the limitations of data models

Day 77: Paper a Week

Start section 8 - Rashomon and the Multiplicity of Good Models

Day 78: Paper a Week

Read section 9

Day 79: Prepare for ML Conference

Fix info on the slides
Make up a presentation
Add/subtract information

Day 80: Paper a Week

Read section 10

Day 81: Paper a Week

Read section 11
Read section 12
Finished Statistical Modeling: The Two Cultures by Leo Breiman

Day 82: Paper a Week

Start reading A Study in Rashomon Curves and Volumes: A New Perspective on Generalization and Model Simplicity in Machine Learning (Semenova et al)
Update the Paper a Week repository with annotated papers
Checked out Ishan Misra and Yann LeCun's blog post on Self-supervised learning

Day 83: Self-supervised learning

Finish reading [Self-supervised learning: The dark matter of intelligence](https://ai.facebook.com/blog/self-supervised-learning-the-dark-matter-of-intelligence/)
Continue with Paper a Week: Rashomon Curves and Volumes

Day 84: Paper a Week

Read Paper a Week: Rashomon Curves and Volumes

Day 85: Paper a Week

Explore the different statistical concepts introduced in Rashomon Curves & Volumes
Continue reading Rashomon Curves and Volumes

Day 86: AI/ML Research

Learn about the different complexity measures introduced in the Rashomon Curves paper
Read [How to do Research At the MIT AI Lab](https://dspace.mit.edu/bitstream/handle/1721.1/41487/AI_WP_316.pdf?sequence=4&isAllowed=y)
Checked out reading lists for AI from [Stanford](http://i.stanford.edu/pub/cstr/reports/cs/tr/86/1093/CS-TR-86-1093.pdf) and [Berkeley](https://ml.berkeley.edu/reading-list/)

Day 87: EfficientNet

Read the [EfficientNet](https://arxiv.org/pdf/1905.11946.pdf) paper
Learn about FLOPS

Day 88: Yolo

Read the [Yolo](https://arxiv.org/abs/1506.02640) paper

Day 89: Implementing EfficientNet

Lay out the architecture of EfficientNet
Figure out if it's possible to code it using TensorFlow
Start implementation
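
When laying out the architecture, it helps to see the compound scaling rule from the EfficientNet paper as code. The sketch below uses the paper's reported bases (alpha = 1.2 for depth, beta = 1.1 for width, gamma = 1.15 for resolution) and a baseline resolution of 224. The function names are hypothetical, and the released model variants use rounded values, so the raw formula should not be expected to reproduce them exactly.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases from the paper


def compound_scale(phi, base_depth=1.0, base_width=1.0, base_resolution=224):
    """Scale depth, width, and input resolution together by one coefficient phi."""
    depth = base_depth * ALPHA ** phi
    width = base_width * BETA ** phi
    resolution = base_resolution * GAMMA ** phi
    return depth, width, resolution


def flops_multiplier(phi):
    """Approximate FLOPs growth: linear in depth, quadratic in width and resolution."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


for phi in range(4):
    depth, width, resolution = compound_scale(phi)
    print(phi, round(depth, 2), round(width, 2), round(resolution), round(flops_multiplier(phi), 2))
```

Because alpha * beta^2 * gamma^2 is close to 2, each increment of phi roughly doubles the compute budget.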

Day 90: Read MobileNetV2 & MnasNet

Understand what MBConv blocks are
Learn about inverted residual convolution

Day 91: Read MobileNetV2

Get an understanding of the Bottleneck residual block
What are dwise layers?
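
A "dwise" (depthwise) layer applies one k x k filter per input channel and leaves channel mixing to a following 1 x 1 pointwise convolution. The arithmetic below compares multiply-accumulate counts for a standard convolution and a depthwise separable one. It assumes stride 1 and an h x w output feature map, and the layer sizes are made up for illustration.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a k x k convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """Depthwise k x k pass (one filter per channel) plus a 1 x 1 pointwise pass."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 56 x 56 output, 64 -> 128 channels, 3 x 3 kernel.
standard = standard_conv_macs(56, 56, 64, 128, 3)
separable = depthwise_separable_macs(56, 56, 64, 128, 3)
print(standard, separable, round(separable / standard, 4))
```

The ratio works out to 1 / c_out + 1 / k^2, so with 3 x 3 kernels the separable form needs roughly a ninth of the compute once the channel count is large.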

Day 92: Implement EfficientNet

Start by implementing a simple convblock
Design and code the MBConv block

Day 93: Read PointNet

Learn about the architecture of PointNet
Continue with the Berkeley reading list

Day 94: Read Yann LeCun's paper on Deep Learning

Continue reading the paper

Day 95: Semi Supervised Learning

[DeepMind’s New Super Model: Perceiver IO is a Transformer that can Handle Any Dataset](https://pub.towardsai.net/deepminds-new-super-model-perceiver-io-is-a-transformer-that-can-handle-any-dataset-dfcffa85fe61)
Briefly learn about multi-modal models
[Learn about semi-supervised learning models](http://www.cs.cmu.edu/~10701/slides/17_SSL.pdf)
What is the co-training algorithm

Day 96: Co-Training Algorithm

Learn about the co-training algorithm in depth
Read [Combining Labeled and Unlabeled Data with Co-Training (Blum and Mitchell)](https://www.cs.cmu.edu/~avrim/Papers/cotrain.pdf)
Review SSL with [Semi-Supervised Learning](http://pages.cs.wisc.edu/~jerryzhu/pub/SSL_EoML.pdf)

Day 97: Multi-Modal

Learn more about the functionality of multi-modal models

Day 98: SSL and Weakly Supervised Learning

Learn about the difference between SSL and WSL

Day 99: Write article about 100 Days of ML

Start on the medium article

Day 100: Challenge Completed

Finish the medium article

Harsh188/100-Days-of-ML-Pt2