Course: Algorithms for Data Science

Lecturer: Hossein Hajiabolhassan
Course webpage: Algorithms For Data Science
Data Science Center, Shahid Beheshti University


Index:


Main TextBooks:

Book 1 Book 2

Slides and Papers

Recommended Slides & Papers:

  1. Introduction to Data Science

    Required Reading:
    
  2. Toolkit Lab: Jupyter Notebook

    Required Reading:
    
  3. Toolkit Lab: Git & GitHub

    Required Reading:
    
  4. Introduction to Data Mining

    Required Reading:
    
  5. MapReduce and the New Software Stack

    Required Reading:
    
  6. Link Analysis

    Required Reading:
    
    Additional Reading:
    
  7. Toolkit Lab: Orange & Weka

    Required Reading:
    
    Additional Reading:
    
  8. Representative-Based Clustering

    Required Reading:
    
    Additional Reading:
    
  9. Hierarchical Clustering

    Required Reading:
    
    Additional Reading:
    
  10. Density-Based Clustering

    Required Reading:
    
  11. Spectral and Graph Clustering

    Required Reading:
    
    Additional Reading:
    
  12. Clustering Validation

    Required Reading:
    
    Additional Reading:
    
  13. Probabilistic Classification

    Required Reading:
    
    Additional Reading:
    
  14. Decision Tree Classifier

    Required Reading:
    

Additional Slides:

Class time and Location

Saturday and Monday 08:00-09:30 AM (Fall 2018), Room 208.

Grading:

  • Homework – 15%
    — Will consist of mathematical problems and/or programming assignments.
  • Midterm – 35%
  • Final exam – 50%
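
As a rough illustration of how the weights listed above combine, here is a minimal sketch. The function name `course_grade`, the dictionary keys, and the 0-100 score scale are assumptions made for this example; the course may use a different scale or rounding rule.

```python
# Weights copied from the grading list; everything else is illustrative.
WEIGHTS = {"homework": 0.15, "midterm": 0.35, "final": 0.50}


def course_grade(scores):
    """Return the weighted course grade.

    `scores` maps each component name to a score; all components are
    assumed to be on the same scale (0-100 in the example below).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 90 on homework, 70 on the midterm, 80 on the final.
example = course_grade({"homework": 90, "midterm": 70, "final": 80})
print(round(example, 2))
```

Because the final exam carries half of the weight, a given change in the final exam score moves the course grade more than the same change in homework or the midterm.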

Two Written Exams:

Midterm Examination: Monday 1397/09/12, 08:00-10:00
Final Examination: Sunday 1397/10/16, 08:30-10:30

Prerequisites:

General mathematical sophistication and a solid understanding of Algorithms, Linear Algebra, and Probability Theory at the advanced undergraduate or beginning graduate level, or equivalent.

Linear Algebra:

Probability and Statistics:

Topics:

For general inspiration, have a look at some Kaggle reports or reports by Stanford students (CS224N, CS224D).

Account:

You need a GitHub account to share your projects. GitHub offers free accounts as well as plans with private repositories. GitHub is like the hammer in your toolbox, so you need to have it!

Academic Honor Code:

Honesty and integrity are vital elements of academic work. All your submitted assignments must be entirely your own (or your own group's).

We will follow the standard approach of the Department of Mathematical Sciences:

  • You can get help, but you MUST acknowledge the help on the work you hand in.
  • Failure to acknowledge your sources is a violation of the Honor Code.
  • You can talk to others about the algorithm(s) to be used to solve a homework problem, as long as you then mention their name(s) on the work you submit.
  • You should not use the code of others or look at the code of others when you write your own. You can talk to people, but you have to write your own solution/code.

Questions?

I will hold office hours for this course on Mondays (09:30 AM-12:00 PM). If this is not convenient, email me at hhaji@sbu.ac.ir or talk to me after class.