/awesome-machine-learning-engineer

A curated awesome list of Machine Learning Engineering resources. Feel free to contribute! 🚀

Awesome Machine Learning Engineer

A curated list of delightful Machine Learning Engineering resources.
The resources are structured as follows: Title - Description (Reading time)
The descriptions are written so that they complete the sentence "After reading this article you will have learned to ...".

For more awesomeness, check out Awesome.

Contents

Communication

Software Engineering

API design

Version control

Code review

Python

Ideology

Logging

Linting

  • Flake8 - Enforce code style consistency in a Python project. (10 min)
  • Flake8 extensions - Pick the right Flake8 extensions.
  • Pylint
  • pydocstyle - Check compliance with Python docstring conventions. (5 min)

Testing

  • hypothesis-auto - Write fully automatic unit tests based on type annotations. (30 min)
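
To make the idea behind annotation-driven tests concrete, here is a minimal sketch that uses only the Python standard library. It is not hypothesis-auto's API: the names `GENERATORS` and `check_with_random_inputs` are hypothetical, and the input ranges are arbitrary assumptions. It only illustrates how type annotations can be read to generate inputs and to check the return type.

```python
import random
from typing import Callable, get_type_hints

# Hypothetical generators for a few annotation types; a real tool covers far more.
GENERATORS = {
    int: lambda rng: rng.randint(-1000, 1000),
    float: lambda rng: rng.uniform(-1000.0, 1000.0),
    bool: lambda rng: rng.choice([True, False]),
    str: lambda rng: "".join(rng.choice("abc xyz") for _ in range(rng.randint(0, 8))),
}


def check_with_random_inputs(func: Callable, runs: int = 50, seed: int = 0) -> int:
    """Call func with random arguments built from its annotations.

    Asserts that each result matches the annotated return type and
    returns the number of runs that were executed.
    """
    rng = random.Random(seed)
    hints = get_type_hints(func)
    return_type = hints.pop("return", None)
    for _ in range(runs):
        # Build one keyword argument per annotated parameter.
        kwargs = {name: GENERATORS[tp](rng) for name, tp in hints.items()}
        result = func(**kwargs)
        if return_type is not None:
            assert isinstance(result, return_type), (kwargs, result)
    return runs


def add(number_1: int, number_2: int = 1) -> int:
    return number_1 + number_2


print(check_with_random_inputs(add))
```

A dedicated library additionally shrinks failing inputs and supports far richer types, which this sketch does not attempt.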

Type annotation

Machine Learning

Practical Theory

While, in theory, you can just download TensorFlow and start building deep neural networks, it doesn't hurt to know some of the theory and philosophy behind the algorithms that so many of us know and love today.

Sklearn

  • Custom Estimators - Create your own custom estimator. (20 min)
  • Pipelines - Combine transformers and estimators into pipelines. (15 min)
  • Pipelines and custom Estimators
  • Tuning hyperparameters - Implement grid search and randomized search for parameter optimization. (10 min)
  • TODO: Grid search vs. random search vs. Bayesian hyperparameter optimization (Gaussian processes)
  • TODO: Comparison of Bayesian hyperparameter optimizers (PyGPGO)
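
As a rough illustration of how the topics in this section fit together, the sketch below combines a custom transformer, a pipeline, and a grid search. The `ClipOutliers` class is a hypothetical example rather than part of scikit-learn, the dataset is synthetic, and the parameter values are arbitrary assumptions chosen to keep the run short.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class ClipOutliers(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: clip each feature to quantiles learned in fit."""

    def __init__(self, lower=0.05, upper=0.95):
        # Store constructor arguments unchanged so that cloning and
        # get_params/set_params work during the grid search.
        self.lower = lower
        self.upper = upper

    def fit(self, X, y=None):
        X = np.asarray(X)
        # Learned attributes end with an underscore by scikit-learn convention.
        self.low_ = np.quantile(X, self.lower, axis=0)
        self.high_ = np.quantile(X, self.upper, axis=0)
        return self

    def transform(self, X):
        return np.clip(np.asarray(X), self.low_, self.high_)


# Synthetic data, used only to make the example self-contained.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

pipeline = Pipeline([
    ("clip", ClipOutliers()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Step parameters are addressed as "<step name>__<parameter name>".
param_grid = {
    "clip__lower": [0.0, 0.05],
    "clf__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Swapping `GridSearchCV` for `RandomizedSearchCV` gives the randomized variant; it samples a fixed number of candidates from the grid or from distributions instead of trying every combination.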

DevOps

Package management

  • Understanding Conda and Pip - Know the advantages of Conda over Pip. (5 min)
  • Conda tutorial - Manage packages and reproducible environments using one tool. (15 min)
  • Conda package index - Search for packages in Anaconda Cloud. (1 min)
  • Conda myths - Debunk some common myths and misconceptions about Conda. (5 min)
  • Conda in-depth
  • TODO: conda vs virtualenv, pyenv, pipenv.
  • TODO: explain how conda-forge works.
  • TODO: explain environment.yml + interactions with Docker.

Containerization

Shell

Terraform

Security

  • TODO: CVE scans (frontend and backend)
  • TODO: OSS license scan
  • TODO: mutual TLS, IP whitelisting, (VPN)

Infrastructure

Datastores

  • TODO: S3
  • TODO: DynamoDB
  • TODO: MongoDB

Message queues

Curated by Radix

Radix is a Belgium-based Machine Learning company.

We invent, design and develop AI-powered software. Together with our clients, we identify which problems within organizations can be solved with AI, demonstrating the value of Artificial Intelligence for each problem.

Our team is constantly looking for novel and better-performing solutions and we challenge each other to come up with the best ideas for our clients and our company.

Here are some examples of what we do with Machine Learning, the technology behind AI:

  • Help job seekers find great jobs that match their expectations. On the Belgian Public Employment Service website, you can find our job recommendations based on your CV alone.
  • Help hospitals save time. We extract diagnoses from patient discharge letters.
  • Help publishers estimate their impact by detecting copycat articles.

We work hard and we have fun together. We foster a culture of collaboration, where each team member feels supported when taking on a challenge, and trusted when taking on responsibility.
