Hypotheses and Models in Data Intensive Domains Course, @hamc2019, Masters 1/2

Faculty of Computational Mathematics and Cybernetics, Moscow State University

Classes: Wednesday, 18.30 - 20.00, room 606

M. Jordan: ... current focus on doing AI research via the gathering of data, the deployment of “deep learning” infrastructure, and the demonstration of systems that mimic certain narrowly-defined human skills — with little in the way of emerging explanatory principles — tends to deflect attention from major open problems in classical AI. These problems include the need to bring meaning and reasoning into systems that perform natural language processing, the need to infer and represent causality, the need to develop computationally-tractable representations of uncertainty and the need to develop systems that formulate and pursue long-term goals. These are classical goals in human-imitative AI, but in the current hubbub over the “AI revolution,” it is easy to forget that they are not yet solved.

We have to have error bars around all our predictions ... Otherwise it's gambling, and too many failed predictions can lead to big disappointment with Big Data - a Big Data Winter.

M. Brodie: Yet there is a potential Big Data Winter ahead if people blindly apply Big Data and more specifically machine learning.

Course overview

This is a one-term course that surveys the theory and application of methods for working with hypotheses and models in data-intensive domains. Topics include an overview of different approaches to hypothesis and model formulation, representation, and testing; logical and probabilistic inference; and model quality assessment. The course is part of a sequence of courses in the Big Data track and is taught to first- and second-year master's students.

Course outcomes

The main objective of this course is to give an overview of the hypothesis-driven approach and of the skills needed to do empirical research in data-intensive domains
The course aims to provide students with techniques and recipes for applying a statistical/probabilistic framework to assess the quality of models
The course will also emphasize recent developments in hypothesis management and will present some open questions and areas of ongoing research

How students' time is spent

2 hours per week - lectures
4 hours per week - homework

Assessment

40% - Final Oral Exam
30% - Class tests
30% - Homework

Grade scale: 5: 80 - 100%; 4: 60 - 79%; 3: 40 - 59%; <3: 0 - 39%.
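
The weights and grade bands above amount to a short calculation. The Python sketch below is illustrative only and is not part of the course materials: the names `final_score` and `to_grade` are hypothetical, each component is assumed to be scored on a 0 - 100 scale, and a fractional total that falls between two bands (for example 79.5) is assumed to stay in the lower band, which the scale above does not specify.

```python
# Component weights in percent, taken from the assessment scheme:
# final oral exam 40%, class tests 30%, homework 30%.
WEIGHTS = {"exam": 40, "tests": 30, "homework": 30}


def final_score(exam: float, tests: float, homework: float) -> float:
    """Return the weighted total in percent.

    Assumption: every component is given on a 0-100 scale.
    """
    parts = {"exam": exam, "tests": tests, "homework": homework}
    for name, value in parts.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return sum(WEIGHTS[name] * value for name, value in parts.items()) / 100


def to_grade(score: float) -> str:
    """Map a percentage to a grade band.

    Assumption: the lower bounds 80, 60 and 40 are the cut-offs, so a
    fractional score such as 79.5 is treated as grade 4.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "5"
    if score >= 60:
        return "4"
    if score >= 40:
        return "3"
    return "<3"
```

For example, `final_score(exam=70, tests=90, homework=80)` gives 28 + 27 + 24 = 79, which `to_grade` places in band 4.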

Instructor

Dmitry Kovalev

Assistants

Course Materials

This repository contains lecture notes and homework assignments for @hamc2019. It will be updated as the class progresses.

Week	Lecture notes	Supplementary materials	Homework	Tests
1	Introduction Hypothesis-driven approach	M. Jordan about AI revolution; J. Gray. Fourth Paradigm; M. Brodie. Understanding Data Science; L. Kalinichenko. Methods and Tools for Hypothesis-Driven Research Support
2	Lecture 1	Course at KhanAcademy; Limitations of CLT; CI and hypotheses		test_1
3	Lecture 2		homework_1

zhekalat/hamc2019