Each student will be provided with a cloud-based programming environment that includes Python; Python libraries for linear algebra, plotting, and machine learning; TensorFlow; Storm; Spark; and GitHub for submitting project code.
This course does not assume any prior exposure to data analytics or machine learning theory or practice. Graduate students familiar with the principles of programming should be able to complete this course successfully.
Machine learning is concerned with computer programs that enable the behavior of a computer to be learned from examples or experience rather than dictated through rules written by hand. This class is meant to teach the practical side of machine learning for applications, such as mining newsgroup data or building adaptive user interfaces. While it will be essential to learn conceptually how machine learning algorithms work and interact with data, the emphasis will be on effective methodology for using data analytics and machine learning to solve practical problems. The course is about knowing how to conceptualize a problem, how to represent your data, how to interpret your results properly, how to do an effective error analysis, and how to use the results of that analysis.
The main aim of the course is to provide the skills needed to apply machine learning algorithms to real applications. We will cover fewer learning algorithms and spend less time on math and theory, and instead spend more time on the hands-on skills required to make algorithms work on a variety of data sets. There will be a heavy project focus, and when you have completed the course, you should be fully prepared to attack new problems using machine learning.
There is no required textbook. Slides will be uploaded to GitHub.
- Witten, I. H., Frank, E., and Hall, M. (2011). Data Mining: Practical Machine Learning Tools and Techniques, third edition, Elsevier: San Francisco, ISBN 978-0-12-374856-0
- Mitchell, T. (1997). Machine Learning, McGraw-Hill
- Daumé III, H. A Course in Machine Learning (preprint available online)
Quizzes (10%)
Assignments (20% total)
Mid-terms (15% each)
Course project (45%)
- Introduction to Big Data, Data Analytics, and Machine Learning
- Python Data Structures for working with Data
- A crash course on Cloud Computing and Jupyter Notebook
- A crash course on Python data structures
- Real-World Data Samples
- Linear Algebra
- Probability and Information Theory
- Linear Regression and Classification
- Logistic Regression
- Building Machine Learning Models
- Deep Learning
- Convolutional Networks
- Unsupervised Learning
- Recurrent NNs
- Deep NLP
- Reinforcement Learning
- Apache Spark: Machine Learning on Big Data
- K-Means Clustering
- Principal Component Analysis
- Bayesian Methods
- Decision Trees and Random Forests
- Support Vector Machines
- K-Nearest Neighbor