SysBioShortCourse

Modern systems biology methods make heavy use of algorithms that automatically determine useful data features and tune algorithm parameters to maximize accuracy. This lecture will introduce basic formulations of machine learning problems, including supervised and unsupervised learning problems, and study their applications in genomics and systems biology.

Lecture Topics:

Classification and regression, support vector machines, randomized decision forests, feature selection, neural nets, regularization, clustering (K-means), PCA, t-SNE
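As a small illustration of one unsupervised topic from the list above, the following is a minimal sketch of K-means clustering (Lloyd's algorithm) using only the Python standard library. It is not taken from the course notebooks: the function name `kmeans`, its parameters, and the synthetic two-group data are assumptions made for illustration.

```python
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps.

    `points` is a list of equal-length numeric tuples. The name and
    signature are illustrative, not part of the course material.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Synthetic data (an assumption): two well-separated groups of 2-D samples.
rng = random.Random(1)
group_a = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(30)]
group_b = [(rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)) for _ in range(30)]
points = group_a + group_b

centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

In practice a library implementation would normally be preferred; the sketch is only meant to show the two alternating steps. No labels are used during fitting, which is what makes the task unsupervised.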

Key Concepts Learned:

Students will learn standard formalisms for specifying machine learning tasks and basic mathematical and algorithmic tools for learning predictors from data.
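One common formalism of this kind is regularized empirical risk minimization, sketched below. The notation is an assumption chosen for illustration and may differ from the lecture's: training pairs $(x_i, y_i)$ for $i = 1, \dots, n$, a hypothesis class $\mathcal{F}$, a loss function $L$, a regularizer $\Omega$, and a regularization weight $\lambda \ge 0$.

```latex
\hat{f} \;=\; \arg\min_{f \in \mathcal{F}} \;
  \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr) \;+\; \lambda \, \Omega(f)
```

Under this view, classification and regression differ mainly in the choice of $L$ (for example, hinge loss versus squared error), while $\lambda$ trades off fit to the training data against model complexity.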

Software: Python 3.6 installed through Anaconda is recommended: https://docs.anaconda.com/anaconda/install/

References:

The Elements of Statistical Learning: Data Mining, Inference, and Prediction. T. Hastie, R. Tibshirani, J. Friedman. http://statweb.stanford.edu/~tibs/ElemStatLearn/

Machine Learning: A Probabilistic Perspective. K. Murphy. https://www.cs.ubc.ca/~murphyk/MLbook/