Course: Big Data Optimization

Course Information

Instructor: Niao He

Course Description

Nearly every problem in machine learning and high-dimensional statistics can be formulated in terms of optimization of some function, possibly under some constraints. In the wake of Big Data era, there is an increasingly strong need to design efficient and scalable algorithms to address optimization problems in large scale - including problems exhibiting inherently high dimensions and problems involving truly massive data sets. The key aim of this course is to expose students to a wide range of modern algorithmic developments in optimization (mainly in convex optimization) and bring them near the frontier of research in large-scale optimization and machine learning.

Courtesy warning: The course will be theory-oriented and emphasize deep understanding of structures of optimization problems and computation complexity of numerical algorithms. The course will not cover any software or tools used for big data analytics and is not an application course.

Prerequisite

Students are expected to have strong knowledge of linear algebra, real analysis, and probability theory. Some prior exposure to optimization at a graduate level is preferred.

Textbook

No book is required, but you are highly recommended to read the following:

Ben-Tal & Nemirovski. Lectures on Modern Convex Optimization, SIAM, 2011.
Nesterov. Introductory Lectures on Convex Optimization: A Basic Course, Kluwer-Academic, 2003.
Sra, Nowozin, Wright (eds). Optimization for Machine Learning. MIT Press, 2011.
Bubeck. Convex Optimization: Algorithms and Complexity. In Foundations and Trends in Machine Learning, 2015.

Class Schedule

Module I: Introduction and Fundamentals

Lecture	Topic	Resources
1	Introduction to course	[Scribe]
2	Convex Sets	[Scribe]
3	Convex Functions	[Scribe]
4	Convex Functions	[Scribe]
5	Convex Optimization	[Scribe]
6	Conic Programming	[Scribe]

Module II: Smooth Convex Optimization

Lecture	Topic	Resources
7	Introduction to Optimization Algorithms	[Scribe]
8	Gradient Descent	[Scribe]
9	Gradient Descent and Its Acceleration	[Scribe]
10	Projected Gradient Descent	[Scribe]
11	Conditional Gradient (a.k.a. Frank Wolfe Algorithm)	[Scribe]
12	Coordinate Descent Algorithms	[Scribe]
13	Case Study on Logistic Regression	[Scribe]

Module III: Nonsmooth Convex Optimization

Lecture	Topic	Resources
14	Subgradient Method	[Scribe]
15	From Subgradient Descent to Mirror Descent	[Scribe]
16	Smoothing Techniques I	[Scribe]
17	Smoothing Techniques II & Proximal Point Algorithm	[Scribe]
18	Mirror Prox for Saddle Point Problems	[Scribe]
19	(Accelerated) Proximal Gradient Method	[Scribe]
20	Splitting Algorithms	[Scribe]

Module IV: Stochastic and Online Optimization

Lecture	Topic	Resources
21	Sample Average Approximation	[Scribe]
22	Stochastic Gradient Descent / Mirror Descent	[Scribe]
23	Incremental Gradient Methods for Finite Sum Problems	[Scribe]
24	Summary and Outlook	[Slides]

niaohe/Big-Data-Optimization-Course