Applied Machine Learning

Lecturer: Hossein Hajiabolhassan

Data Science Center

Shahid Beheshti University
Teaching Assistants:
Zahra Taheri, Ali Hojatnia, Yavar T. Yeganeh,
Fatemeh Amanian, Mahdis Hosseini, Sohrab Faridi

Index:


Course Overview:

Machine learning is an area of artificial intelligence that gives systems the ability to
learn automatically. It allows machines to handle new situations through analysis,
self-training, observation, and experience. Its success has made it the default method of
choice for artificial intelligence practitioners. In this course, we review the fundamentals
and algorithms of machine learning.

TextBooks:

Main TextBooks:
Additional TextBooks:

Slides and Papers:

Recommended Slides & Papers:

  1. Toolkit Lab (Part 1: Anaconda, Jupyter Lab, Markdown, Git, GitHub, and Google Colab)

Required Reading:

Anaconda, Jupyter Lab, Markdown, Git, GitHub, and Google Colab:

Teaching Assistant Class:
Python continues to hold a leading position in solving data science tasks and challenges.
Here are three of its most important libraries:
NumPy is the fundamental package for scientific computing with Python.
Pandas provides easy-to-use data structures and data analysis tools.
Matplotlib is a Python 2D plotting library.
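As a minimal sketch of how these three libraries typically work together (the toy data and all variable names are illustrative assumptions, not course material), NumPy holds the numbers, Pandas labels them as a table, and Matplotlib plots them:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# NumPy: vectorized arithmetic on arrays.
x = np.linspace(0.0, 1.0, 5)   # five evenly spaced points: 0.0, 0.25, 0.5, 0.75, 1.0
y = x ** 2                     # element-wise square, no explicit loop

# Pandas: labeled, tabular data built on top of NumPy arrays.
df = pd.DataFrame({"x": x, "y": y})
mean_y = df["y"].mean()

# Matplotlib: a 2D line plot of the same data.
fig, ax = plt.subplots()
ax.plot(df["x"], df["y"], marker="o")
ax.set_xlabel("x")
ax.set_ylabel("y = x^2")
```

In a Jupyter Notebook the figure is usually rendered inline below the cell. In a plain script, `fig.savefig(...)` with a file name of your choice writes it to disk.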
Resources:
Scipy Lecture Notes, Data Science IPython Notebooks
Suggested Reading:
Additional Resources:
Python Overview [Word]


Python Tutorial [PDF] [Code]
Numpy [PDF] [Code]
User Guide [Link]
Quickstart [Link]
Reference [Link]
Practice Numpy in LabEx [Link]
Cheatsheet [Link]
Matplotlib [PDF][Code]
Example [Link]
Tutorials [Link]
Reference [Link]
Practice Matplotlib in LabEx [Link]
Cheatsheet [Link]
Pandas [Code]
10 Min to Pandas [Link]
Cookbook [Link]
Tutorials [Link]
Reference [Link]
Practice Pandas in LabEx [Link]
Cheatsheet [Link]
Seaborn: Statistical Data Visualization [Link]
Example [Link]
Tutorials [Link]
Reference [Link]
Cheatsheet [Link]
Scikit Learn [Link]
Scikit Image [Link]
Scikit Tutorial #1 [Code]
Scikit Tutorial #2 [Code]
Cheatsheet [Link]
  1. Introduction

    Required Reading:
    
  2. Empirical Risk Minimization

    Required Reading:
    
  3. PAC Learning

    Required Reading:
    
  4. Learning via Uniform Convergence

    Required Reading:
    
  5. The Bias-Complexity Tradeoff

    Required Reading:
    
    Suggested Reading:
    
    Additional Reading:
    
  6. The VC-Dimension

    Required Reading:
    
  7. Toolkit Lab (Part 2)

    Required Reading:
    
  8. Linear Predictors

    Required Reading:
    
    Additional Reading:
    
    R (Programming Language):
    
  9. Decision Trees

    Required Reading:
    
    Additional Reading:
    
    R (Programming Language):
    
  10. Nearest Neighbor

    Required Reading:
    
    Additional Reading:
    
    R (Programming Language):
    
  11. Ensemble Methods

    Required Reading:
    
    Additional Reading:
    
    R (Programming Language):
    
  12. Model Selection and Validation

    Required Reading:
    
    Suggested Reading:
    
    Additional Reading:
    
    R (Programming Language):
    
  13. Neural Networks

    Required Reading:
    
    Additional Reading:
    
    R (Programming Language):
    
  14. Convex Learning Problems

    Required Reading:
    
    Additional Reading:
    
  15. Regularization and Stability

    Required Reading:
    
    Additional Resources:
    
    R (Programming Language):
    
  16. Support Vector Machines

    Required Reading:
    
    Additional Reading:
    
    R (Programming Language):
    
  17. Multiclass Classification

    Required Reading:
    

Class Time and Location:

Saturday, Monday, Wednesday 10:30-12:00

Recitation and Assignments:

Monday and Wednesday, 17:00-18:30. Refer to the following link to check the assignments.
I also recommend studying the linked recitation and assignments from the 2020 machine learning course.

Projects:

Projects are programming assignments that cover the topics of this course. Each project is written as a
Jupyter Notebook. Projects require Python 3.7, as well as
the additional Python libraries listed below.

  • Python 3.7: An interactive, object-oriented, extensible programming language.
  • NumPy: A Python package for scientific computing.
  • Pandas: A Python package for high-performance, easy-to-use data structures and data analysis tools.
  • Scikit-Learn: A Python package for machine learning.
  • Matplotlib: A Python package for 2D plotting.
  • SciPy: A Python package for mathematics, science, and engineering.
  • IPython: An architecture for interactive computing with Python.
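Before starting a project, it can help to confirm that the libraries above are importable. The following is a minimal sketch, assuming the usual import names (for example, Scikit-Learn is imported as `sklearn`). The helper `installed_versions` is a hypothetical name introduced here for illustration:

```python
import importlib
import sys

# Import names of the libraries listed above (an assumption about a standard install).
REQUIRED = ["numpy", "pandas", "sklearn", "matplotlib", "scipy", "IPython"]

def installed_versions(modules):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" if the module does not expose __version__.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions

if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, version in installed_versions(REQUIRED).items():
        print(name, version if version is not None else "NOT INSTALLED")
```

Running this in the first cell of a notebook shows at a glance which packages still need to be installed.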

Practical Guide:

Fascinating Guide to Use Python Libraries (Machine Learning):

Google Colab:

Google Colab is a free cloud service that also provides free access to a GPU.
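To check whether a GPU is attached to the current runtime, one option is to query the NVIDIA command-line tool. This is only a sketch: it assumes an NVIDIA GPU and that `nvidia-smi` is on the path when a GPU runtime is selected, and `gpu_visible` is a hypothetical helper name:

```python
import shutil
import subprocess

def gpu_visible():
    """Return True if `nvidia-smi -L` runs successfully and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        # The tool is not installed, so no NVIDIA GPU can be queried.
        return False
    result = subprocess.run(
        ["nvidia-smi", "-L"],          # -L lists the GPUs, one per line
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0 and b"GPU" in result.stdout

print("GPU available:", gpu_visible())
```

If this prints `False` in Colab, the runtime type is probably set to CPU and can be changed in the notebook's runtime settings.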

LaTeX:

Students can include mathematical notation in the markdown cells of their Jupyter Notebooks using LaTeX.

  • A Brief Introduction to LaTeX PDF
  • Math in LaTeX PDF
  • Sample Document PDF
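
As an illustration of what such a markdown cell might contain, here is the standard definition of the empirical risk under the 0-1 loss. The symbols $h$, $S$, and $m$ are notational assumptions made for this example and may differ from the notation used in the lectures:

```latex
The empirical risk of a hypothesis $h$ on a sample
$S = \{(x_1, y_1), \dots, (x_m, y_m)\}$ is

$$
L_S(h) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}\left[ h(x_i) \neq y_i \right]
$$
```

Single dollar signs produce inline math, and double dollar signs produce a centered display equation.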

Useful NoteBooks:

Grading:

  • Homework – 30%
    — Will consist of mathematical problems and/or programming assignments.
  • Midterm – 20%
  • Final – 50%

Written Exams:

Midterm Examination: Saturday 1400/09/27, 10:30-12:00
Final Examination: Thursday 1400/11/07, 14:00-16:00

Prerequisites:

General mathematical sophistication; and a solid understanding of Algorithms, Linear Algebra, and Probability Theory, at the advanced undergraduate or beginning graduate level, or equivalent.

Linear Algebra:

Probability and Statistics:

Discrete Mathematics:

Course (Videos, Lectures, Assignments): MIT OpenCourseWare (Discrete Mathematics)

Topics:

Have a look at some Kaggle reports or Stanford student reports (CS224N, CS224D) for general inspiration.

Account:

You need a GitHub account to share your projects. GitHub offers both free accounts and plans with private repositories. GitHub is like the hammer in your toolbox, so you need to have it!

Academic Honor Code:

Honesty and integrity are vital elements of academic work. All your submitted assignments must be entirely your own (or your own group's).

We will follow the standard approach of the Department of Mathematical Sciences:

  • You can get help, but you MUST acknowledge the help on the work you hand in.
  • Failure to acknowledge your sources is a violation of the Honor Code.
  • You can talk to others about the algorithm(s) to be used to solve a homework problem, as long as you then mention their name(s) on the work you submit.
  • You should not use others' code or look at others' code when you write your own. You can talk to people, but you have to write your own solution/code.

Questions?

I hold office hours for this course on Mondays (09:30 AM-12:00 PM). If this is not convenient, email me at hhaji@sbu.ac.ir or talk to me after class.