GBE3072-41_Biostatistics-and-Big-data-II_2020Fall

Course materials for the Fall 2020 course "Biostats and Big Data II"

Primary language: MATLAB. License: GNU General Public License v3.0 (GPL-3.0).

2020 Fall "Biostats and Big Data 2" at SKKU GBME



Download

You can download the class materials with the following command:

$ git clone https://github.com/wanirepo/Stats2_2020Fall

Once you have cloned the GitHub repository, run the following command inside it to get the latest updates:

$ git pull

Alternatively, you can download the repository as a zip file or use GitHub Desktop. Class materials (e.g., lecture slides, assignments) will be uploaded before each class.

A good Git tutorial is available here: https://rogerdudler.github.io/git-guide/index.html

What are the aims of this course?

Data are everywhere; this is the era of big data. Data science has already become a key element in many areas of research and industry. The primary aim of this course is to learn basic concepts and skills for data analysis, including random variables, sampling distributions, hypothesis testing, linear modeling, and data visualization, preparing students for life after graduation in this data-are-everywhere age. This class, “Biostats and Big data 2”, is the advanced version of the “Biostats and Big data” course. You will also get hands-on experience and in-depth study materials for statistics. If you have not completed “Biostats and Big data”, I recommend taking that class first.

Course format (flipped classroom) and expectations

This class uses a flipped classroom format, which is a new way of teaching and learning. In a traditional class, you listen passively to the lecture in the classroom and do homework at home; in a flipped classroom, you watch the lecture at home and do homework and practice exercises in class. I experienced this format during my PhD (in a Machine Learning class) and deeply enjoyed it. I found that the flipped classroom helped students stay engaged and provided a good environment for hands-on experience. For these reasons, I have long wanted to teach a class in this format.

This is the first semester I am offering this class. The class materials will therefore not be perfect, and I might not be able to provide videos for every week. Please consider yourself a co-creator of this class. We will make and shape the class together.

Potential impacts on the course format due to Covid-19

As you are all aware, all classes have been affected by Covid-19. I will try to hold this class offline (in person) if possible, but it depends on the number of students. I will keep you posted!

Textbooks

Main textbooks:

  • "Stats: Data and Models" by De Veaux, Velleman, and Bock
  • "Statistical Thinking for the 21st Century" by Russell Poldrack

Software

I will use Matlab, R, and two free software packages for statistical analysis, JAMOVI and JASP. You can download Matlab through SKKU. R is an open-source programming language.

TAs

  • TBA

Evaluation

This course uses absolute evaluation (grades are based on fixed criteria rather than on a curve).

  1. Attendance (30%)
  2. Participation (including pop-questions) (20%)
  3. Final exam (25%)
  4. Homework (25%)
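As a rough illustration of how the four weights above could combine into a single score, here is a minimal Python sketch. It assumes that each component is scored on a 0-100 scale and that the total is a simple weighted sum; the README does not say how components are actually scored or combined, and the function name and example student are hypothetical.

```python
# Component weights taken from the Evaluation list above (percent of the total).
WEIGHTS = {
    "attendance": 30,
    "participation": 20,
    "final_exam": 25,
    "homework": 25,
}


def weighted_total(scores):
    """Combine per-component scores (assumed 0-100 each) into a 0-100 total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    # Weighted sum; dividing by 100 converts percent weights into fractions.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student, for illustration only.
example = {"attendance": 100, "participation": 80, "final_exam": 70, "homework": 90}
print(weighted_total(example))  # 86.0
```

Because attendance and participation together carry half of the weight under this assumption, they count as much as the final exam and homework combined.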

Schedule (TBA)

L: livestream, R: recording

Week | Video lectures | Class | Playlist link | Homework assignment
Week 1 | | | |
9/1 | | Overview | |
9/3 | | Changes in the class format | |
Week 2 | | | |
9/8 | | (L) Software and programming languages | |
9/10 | | (R) Target research articles and datasets | |
Week 3 | | | |
9/15 | | (L) JAMOVI-Getting started | Intro_JAMOVI |
9/17 | | (R) Wrangling data I | Data Wrangling |
Week 4 | | | |
9/22 | | (L) Wrangling data II, Data Exploration I | Data Exploration |
9/24 | | (R) Data Exploration II | |
Week | | | |
9/29 | | Chuseok (Korean harvest holiday) | |
10/1 | | Chuseok (Korean harvest holiday) | |
Week 5 | | | |
10/6 | | (L) Data Exploration with the real data | |
10/8 | | (R) Sampling (standard error of the mean, confidence interval) | |
Week 6 | | | |
10/13 | | (L) One sample t-test | jamovi t-test | HW1 due: 10/23
10/15 | | (R) Paired t-test, two-sample t-test | |
Week 7 | | | |
10/20 | | (L) T-test on the real data | |
10/22 | | (R) Resampling (bootstrap and permutation) | |
Week 8 | | | |
10/27 | | mid-term | |
10/29 | | mid-term | |
Week 9 | | | |
11/3 | | (L) Chi-square test | jamovi tests for counts |
11/5 | | (R) ANOVA-review | jamovi ANOVA | HW2 due: 11/13
Week 10 | | | |
11/10 | | (L) ANOVA - basics | |
11/12 | | (R) ANOVA - advanced | |
Week 11 | | | |
11/17* | | (R) Correlation, Multiple regression (basics) - basic stats, R-squared | jamovi Regression |
11/19 | | (R) Multiple regression (advanced 1) - Dummy coding, etc. | |
Week 12 | | | |
11/24 | | Multiple regression (advanced 2) - hierarchical, additional R-squared | | HW3 due: 12/04
11/26 | | Cross-validation and independent model testing | |
Week 13 | | | |
12/1 | | Multivariate methods 1 (PCA) | |
12/3 | | Multivariate methods 2 (PLS) | |
Week 14 | | | |
12/8 | | Multivariate methods 3 (or network modeling) | |
12/10 | | Review | |
Week 15 | | | |
12/15 | | final | |
12/17 | | final | |

Note.

  • *: this class may need a make-up session because of a conference.