This course is all about variation, uncertainty, and randomness. Students will learn the vocabulary of uncertainty and the mathematical and computational tools to understand and describe it.
Thomas Stewart
Elson Building, 400 Brandon Ave, Room 156
thomas.stewart@virginia.edu
Github: thomasgstewart
Ethan Nelson
Graduate student in Data Science
ean8fr@virginia.edu
Github: eanelson01
Format of the class: In-class time will be a combination of lectures, group assignments, live coding, and student presentations. Please note: Circumstances may require the face-to-face portion of the class to be online.
Time: MWF, 10 - 10:50am, Dell 1 Room 105
Instructor Office Hours: MW, 11am, Dell 1 Commons (The instructor will leave if there are no questions after 15 minutes.)
TA Office Hours: Thursdays, 1pm, Dell 1 Commons
The following textbooks are freely available online via the UVA library.
Understanding Uncertainty by Dennis V. Lindley
Understanding Probability, 3rd edition by Henk Tijms
Introduction to Probability: Models and Applications by N. Balakrishnan, Markos V. Koutras, Konstadinos G. Politis
The following textbooks may also be helpful.
Probability and Statistics for Data Science by Norman Matloff
Introduction to Probability Models by Sheldon M. Ross
The course will be taught using R.
The following are the four ideas that I hope will persist with students after the minutiae of the Poisson distribution have faded from memory. Each idea is followed by its associated learning outcomes and topics.
Probability is a framework for organizing beliefs; it is not a statement of what your beliefs should be.
Learning outcomes | Topics |
---|---|
compare and contrast different definitions of probability, illustrating differences with simple examples | |
express the rules of probability verbally, mathematically, and computationally | |
illustrate the rules of probability with examples | |
using the long-run proportion definition of probability, derive the univariate rules of probability | |
organize/express bivariate random variables in cross tables | |
define joint, conditional, and marginal probabilities | |
identify joint, conditional, and marginal probabilities in cross tables | |
identify when a research question calls for a joint, conditional, or marginal probability | |
describe the connection between conditional probabilities and prediction | |
derive Bayes' rule from cross tables | |
apply Bayes' rule to answer research questions | |
determine if joint outcomes are independent | |
calculate a measure of association between joint outcomes | |
apply the cross table framework to the special case of binary outcomes | |
define/describe confounding variables | |
list approaches for avoiding confounding | |
Probability models are a powerful framework for describing and simplifying real world phenomena as a means of answering research questions.
Learning outcomes | Topics |
---|---|
list various data types | |
match each data type with probability models that may describe it | |
discuss the degree to which models describe the underlying data | |
tease apart model fit and model utility | |
express probability models mathematically, computationally, and graphically | |
employ probability models (computationally and analytically) to answer research questions | |
explain and implement different approaches for fitting probability models from data | |
visualize the uncertainty inherent in fitting probability models from data | |
explore how to communicate uncertainty when constructing models and answering research questions | |
propagate uncertainty in simulations | |
explore the trade-offs of model complexity and generalizability | |
Probability is a framework for coherently updating beliefs based on new information and data.
Learning outcomes | Topics |
---|---|
select prior distributions which reflect personal belief | |
implement Bayesian updating | |
manipulate the posterior distribution to answer research questions | |
Probability models can be expressed and applied mathematically and computationally.
Learning outcomes | Topics |
---|---|
use probability models to build simulations of complex real world processes to answer research questions | |
Courses carrying a Data Science subject area use the following grading system: A, A-; B+, B, B-; C+, C, C-; D+, D, D-; F. The symbol W is used when a student officially drops a course before its completion or if the student withdraws from an academic program of the University.
Grading Scale:
- 93-100 A
- 90-92 A-
- 87-89 B+
- 83-86 B
- 80-82 B-
- 77-79 C+
- 73-76 C
- 70-72 C-
- <70 F
Grades will be a weighted average of the final exam score (30%), the two midterm exam scores (15% each), the deliverables (20%), and the homeworks (20%).
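As a rough illustration of the weighting above, the following is a minimal sketch in Python (the course itself uses R). It assumes every component has first been converted to a 0-100 percentage; the syllabus does not say how the 0/1/2 homework and deliverable scores are converted, so that step is left out. The function name `course_grade` and the example scores are hypothetical.

```python
# Component weights as stated in the syllabus; they sum to 1.0.
WEIGHTS = {
    "final": 0.30,
    "midterm1": 0.15,
    "midterm2": 0.15,
    "deliverables": 0.20,
    "homeworks": 0.20,
}


def course_grade(percent_scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    missing = set(WEIGHTS) - set(percent_scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * percent_scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "final": 85,
    "midterm1": 88,
    "midterm2": 88,
    "deliverables": 100,
    "homeworks": 90,
}
print(round(course_grade(example), 2))
```

With these made-up inputs the weighted average lands just under 90, which the grading scale above would map to a B+ or an A- depending on how fractional scores are rounded, something the syllabus does not specify.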
Individual homeworks are graded with a score of 0, 1, or 2. After the initial grading, students may resubmit homework within one week of feedback for an additional point. That is, an initial score of 1 can be bumped up to a 2. Likewise, a 0 can be bumped up to a 1.
Deliverables are larger assignments than homework. To complete the deliverables, you will use probability models to build simulations of complex real world processes to answer questions. Deliverables are graded like homeworks, including the opportunity to resubmit for an additional point.
Midterm exams are graded on a 100 point scale. For midterm 1, if your grade on midterm 2 or the final is higher, the higher score will replace the score for midterm 1. Likewise, for midterm 2, if your grade on the final exam is higher, the higher score will replace the score for midterm 2. For example, suppose your exam scores for the two midterms and the final were 72, 88, and 85. For the purposes of the final grade, your exam scores would be 88, 88, and 85.
The final exam is Thursday, May 9 at 9:00am, as assigned by the university. Approximately one week prior to the exam, the instructor will provide a set of questions for which students will prepare solutions and written explanations. During the final exam period, the instructor will provide a supplementary set of questions related to the first set. For example, the instructor may ask:
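The replacement rule for midterm scores can be sketched in a few lines. This is a minimal Python illustration (the course itself uses R), and the function name `effective_exam_scores` is hypothetical; it simply takes the maximum over each exam and the exams that come after it.

```python
def effective_exam_scores(midterm1, midterm2, final):
    """Apply the replacement rule: a later, higher exam score replaces an earlier midterm score."""
    # Midterm 1 can be replaced by midterm 2 or by the final.
    effective_midterm1 = max(midterm1, midterm2, final)
    # Midterm 2 can be replaced only by the final.
    effective_midterm2 = max(midterm2, final)
    # The final exam score is never replaced.
    return effective_midterm1, effective_midterm2, final


# The example from the syllabus: scores of 72, 88, and 85.
print(effective_exam_scores(72, 88, 85))  # (88, 88, 85)
```

Note that an earlier score never replaces a later one: a strong midterm 1 does not raise a weaker midterm 2 or final.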
- Please explain how you solved a particular question in the initial set.
- Please solve a new question (perhaps closely related to a question in the initial set).
- Please explain course topic X.
Students will be graded on both the accuracy of their responses and the clarity with which they explain course concepts and solutions to questions.
Homeworks, deliverables, reading assignments, and exams will be posted on the course calendar below.
Mon | Tue | Wed | Thu | Fri |
---|---|---|---|---|
Jan | | 17 Survey/GitHub setup | | 19 Reading: Get started guide |
22 | | 24 Tools Reproducible Reports | | 26 DUE: HW1. Reading: (optional) First 5 videos of Learn R Programming; (optional) Intro to VS Code; (optional) Using Git with Visual Studio Code. Note that you have already cloned your repo locally, whereas the video creates a fresh repo. |
29 DUE: HW2; RStudio on Rivanna | | 31 | Feb | 2 DUE: HW3; Reading: Understanding Uncertainty, CH 1 |
5 DUE: HW4 | | 7 DUE: HW5; DUE: HW1 Resubmission; Operating Characteristics | | 9 DUE: HW6; DUE: HW2 Resubmission |
12 DUE: HW7; DUE: HW3 Resubmission; Rules of prob 1; Rules of prob 2 | | 14 Exam review; Prep questions; DUE: HW8 | | 16 Exam: You will be given a set of prep questions on Feb 14. Generate solutions to the prep questions prior to the in-class exam. During the exam, you will be given test questions similar to the prep questions. You will be able to copy, paste, and tweak your solutions to the prep questions to solve the exam questions. |
19 Read/Watch Deliverable 1; DUE: HW5 Resubmission | | 21 Work on Deliverable 1 | | 23 DUE: Deliverable 1; HW6 Resubmission |
26 | | 28 DUE: HW9 | | Mar DUE: HW10; Diagnostics |
4 Spring break | | 6 Spring break | | 8 Spring break |
11 In class: Deliverable 2 | | 13 | 14 DUE: Deliverable 2 | 15 |
18 Data types; DUE: HW11; DUE: HW7 Resubmission | | 20 DUE: HW8 Resubmission | | 22 HW12 |
25 HW13 | | 27 Exam review; Prep questions | | 29 Exam: You will be given a set of prep questions on Mar 27. Generate solutions to the prep questions prior to the in-class exam. During the exam, you will be given test questions similar to the prep questions. You will be able to copy, paste, and tweak your solutions to the prep questions to solve the exam questions. |
Apr In class code (Prob tom) Bernoulli (Binomial) Hands/Sequences | | 3 | | 5 Bernoulli sequences |
8 DUE: HW12 Resubmission | | 10 DUE: Deliverable 1 Resubmission | | 12 DUE: HW14 |
15 DUE: HW11 Resubmission | | 17 | | 19 DUE: HW15 |
22 DUE: Deliverable 2 Resubmission | | 24 | | 26 KDE; KDE part 2 |
29 Last class; Exam review | | May DUE: HW13 Resubmission; DUE: HW14 Resubmission | | 3 |
6 | | 8 | 9 Final exam 9:00am - 12:00pm | |
The instructor may alter the course content and grading policies during the semester.
Students are encouraged to study together. The instructions for each assignment/deliverable will indicate if and how students may work together. Students should not collaborate on midterm or final exams. Students who violate the collaborative-work policy on an assignment, deliverable, or exam will receive a score of 0 on that assignment, deliverable, or exam. Students may also be referred to the UVA Honor Committee.
University of Virginia Honor System. All work should be pledged in the spirit of the Honor System at the University of Virginia. The following pledge should be written out at the end of all quizzes, examinations, individual assignments, and papers: “I pledge that I have neither given nor received help on this examination (quiz, assignment, etc.)”. The pledge must be signed by the student. For more information, visit www.virginia.edu/honor.
UVA is committed to creating a learning environment that meets the needs of its diverse student body. If you anticipate or experience any barriers to learning in this course, please feel welcome to discuss your concerns with me. If you have a disability, or think you may have a disability, you may also want to meet with the Student Disability Access Center (SDAC), to request an official accommodation. You can find more information about SDAC, including how to apply online, through their website at www.studenthealth.virginia.edu/SDAC. If you have already been approved for accommodations through SDAC, please make sure to send me your accommodation letter and meet with me so we can develop an implementation plan together.