Statistics learning and data analysis resources. Please contribute and get in touch! See MDmisc notes for other programming and genomics-related notes.
- probability_cheatsheet - A comprehensive 10-page probability cheatsheet that covers a semester's worth of introduction to probability. http://www.wzchen.com/probability-cheatsheet, https://github.com/wzchen/probability_cheatsheet
- linear_tests_cheat_sheet.pdf - Common statistical tests are linear models (or: how to teach stats), https://lindeloev.github.io/tests-as-linear/
- Resources for learning about the history of statistics and statisticians. By statisticians, for statisticians: references to blog posts, books, journal articles, podcasts, interviews, news, and other material about the history of statistics
- Common statistical tests are linear models (or: how to teach stats)
- awesome-bayes - List of resources for Bayesian inference. https://github.com/dimenwarper/awesome-bayes
- bayesian-basics - Bayesian data analysis introduction. https://m-clark.github.io/bayesian-basics/, https://github.com/m-clark/bayesian-basics
- bbr - "Biostatistics for Biomedical Research" by Frank Harrell, the creator of the Hmisc package and many more. https://github.com/harrelfe/bbr. Video lectures, https://www.youtube.com/channel/UC-o_ZZ0tuFUYn8e8rf-QURA
- BIOS2 - Biostatistics 621 / 821 course by Levi Waldron. Classical statistics, from all aspects of regression and survival analysis to dimensionality reduction basics. iPython and R. https://github.com/waldronlab/BIOS2
- book - A written companion for the course "Bayesian Statistics" from the Statistics with R specialization available on Coursera, https://github.com/StatsWithR/book
- book_sample - Another Book on Data Science. Learn R and Python in Parallel. Web, https://www.anotherbookondatascience.com/, GitHub, https://github.com/rnorm/book_sample
- bysh_book - Repo for the Feb 2018 version of Broadening Your Statistical Horizons, https://github.com/broadenyourstatisticalhorizons/bysh_book. The rendered version can be found at https://bookdown.org/roback/bookdown-bysh/
- CC-Linear-mixed-models - Introduction to linear mixed models, https://ourcodingclub.github.io/tutorials/mixed-models/, https://github.com/ourcodingclub/CC-Linear-mixed-models
- CHE379 - Statistical refresher course by Chris A. Mack, From Data to Decisions: Measurement, Uncertainty, Analysis, and Modeling. Videos, exercises, slides in PDF. http://www.lithoguru.com/scientist/statistics/course.html. Video playlist, https://www.youtube.com/playlist?list=PLM2eE_hI4gSDnF-mEa9mrIYx7GCLQVN89
- DATA606Spring2017 - The DATA 606 course for the Spring 2017 semester by Jason Bryer. The course website is at data606.net. https://github.com/jbryer/DATA606Spring2017
- FES - Feature Engineering and Selection: A Practical Approach for Predictive Models, by Max Kuhn and Kjell Johnson. http://www.feat.engineering/, https://github.com/topepo/FES
- fiveMinuteStats - A repo of short "vignettes" illustrating statistical concepts, https://stephens999.github.io/fiveMinuteStats/. https://github.com/stephens999/fiveMinuteStats.git
- Intro2R - Data mining and machine learning, https://github.com/johnros/Intro2R
- ISAT 251 Intro to Statistics with R - Basic statistics by Nicole Radziwill
- ISLR - An Introduction to Statistical Learning with Applications in R (ISLR). The book, R code and the data are available at http://www-bcf.usc.edu/~gareth/ISL/. Videos and slides are at https://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/. Slides are also available at https://www.alsharif.info/iom530
- numerical-linear-algebra - Computational Linear Algebra for Coders. Course itself, http://www.fast.ai/2017/07/17/num-lin-alg/, video series, https://www.youtube.com/playlist?list=PLtmWHNX-gukIc92m1K0P6bIOnZb-mg0hY, git repository, https://github.com/fastai/numerical-linear-algebra
- Kalman-and-Bayesian-Filters-in-Python - Kalman Filter book using Jupyter Notebook. Focuses on building intuition and experience, not formal proofs. Includes Kalman filters, extended Kalman filters, unscented Kalman filters, particle filters, and more. All exercises include solutions. https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python
- OpenIntro-Statistics - An open-source textbook written at the college level. OpenIntro also offers a second college-level intro stat textbook and a high school variant. https://www.openintro.org, https://github.com/OpenIntroOrg/openintro-statistics. Videos, https://www.youtube.com/user/bleue894/playlists, slides, https://github.com/OpenIntroStat/openintro-statistics-slides
- practicing_R - R and statistics, https://github.com/johnros/practicing_R
- PractitionerGuidetoMultiplicity - Practical Guide for Multiple testing, https://github.com/johnros/PractitionerGuidetoMultiplicity
- Probabilistic-Programming-and-Bayesian-Methods-for-Hackers - An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python. https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
- rethinking - Statistical Rethinking course and book package, by Richard McElreath. https://github.com/rmcelreath/rethinking. Andrew Gelman's note about the book, video lectures. GitHub with slides and links to video lectures, https://github.com/rmcelreath/statrethinking_winter2019
- Statistical-Rethinking - An interactive online reading of McElreath's "Statistical Rethinking: A Bayesian Course with Examples in R and Stan" by Levi Waldron. https://github.com/lwaldron/Statistical-Rethinking
- stat540_2014 - STAT540 Statistical Methods for High Dimensional Biology course by Jenny Bryan
- stat401A - Concise statistical refresher by Jarad Niemi, STAT 401A course at Iowa State University. https://github.com/jarad/stat401A. Jarad's website with more statistical courses, STAT 544 and STAT 615, http://www.jarad.me/courses/
- statcomp - Statistical Computing, BIOS 735 - Introduction to Statistical Computing. http://biodatascience.github.io/statcomp, https://github.com/biodatascience/statcomp, https://github.com/biodatascience/statcomp_src
- thinkstats - Statistical Thinking for the 21st Century. Book, http://statsthinking21.org/, and GitHub, https://github.com/psych10/thinkstats
- WinVector.github.io - Various statistical topics with R and Python examples. "IntroductionToDataScience" course. Web-facing pages, https://winvector.github.io/, GitHub repo, https://github.com/WinVector/WinVector.github.io/
- www_stat_cmu_edu_cshalizi_350 - Statistics 36-350: Data Mining by Cosma Shalizi. http://www.stat.cmu.edu/~cshalizi/350/
- Introduction to Causal Inference Fall 2020 course by Brady Neal. Video, lecture material. Twitter
- BIOS 735 - Introduction to Statistical Computing - Statistical concepts in R, by Naim Rashid. GitHub
- Bayesian Computing Course - Python notebooks with applied examples and explanations
- Bayesian Data Analysis at Aalto (CS-E5710) course material, slides, video lectures, code demos, assignments. https://github.com/avehtari/BDA_course_Aalto
- CS229T/STATS231: Statistical Learning Theory, Stanford / Autumn 2018-2019. "Texts and References" section has a good set of course notes and links. https://web.stanford.edu/class/cs229t/
- The Coursera Class: Statistics One, by Princeton. https://github.com/svkerr/Statistics_Class_Princeton
- Introduction to Statistics for Biologists, by Peter Ralph. https://github.com/petrelharp/bisc305
- Data wrangling, exploration, and analysis with R, UBC STAT 545A and 547M courses. https://stat545-ubc.github.io/ and the Git repository https://github.com/STAT545-UBC/STAT545-UBC.github.io
- Harvard CS 109: Data Science course with video lectures, PDF slides and iPython notebooks. Course overview, Videos, Git repository, All course-related material on GitHub
- Advanced data analysis techniques, CVEN 6833, Dr. R. Balaji. Lots of material and links on regression and modeling techniques. http://civil.colorado.edu/~balajir/CVEN6833/
- Econometrics Academy - main statistical methods, examples in R and SAS, short video tutorials
- MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018, by Gilbert Strang, https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
- "A Student's Guide to Bayesian Statistics", by Ben Lambert, https://www.youtube.com/playlist?list=PLwJRxp3blEvZ8AKMXOy0fc0cqT61GsKCG. More Bayesian Class Videos, https://discourse.mc-stan.org/t/bayesian-class-videos/3173
- Statistical inference for data science, Brian Caffo. Full book at http://rpubs.com/cbchisanga/143127, videos at https://www.youtube.com/watch?v=WkOinijQmPU&list=PLpl-gQkQivXiBmGyzLrUjzsblmQsLtkzJ, GitHub version at https://github.com/bcaffo/LittleInferenceBook
- Statistics 110: Probability, by Joe Blitzstein, https://www.youtube.com/playlist?list=PL2SOU6wwxB0uwwH80KTQ6ht66KWxbzTIo. The book "Introduction to Probability", https://twitter.com/stat110/status/1101502622358556674
- An Introduction to Statistical Learning - classic stats learning book by Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani
- practical-statistics-for-data-scientists - Practical Statistics for Data Scientists, 50+ Essential Concepts Using R and Python, by Peter Bruce, Andrew Bruce, and Peter Gedeck. Code repository for the O'Reilly book
- Bayes Rules! An Introduction to Bayesian Modeling with R by Alicia A. Johnson, Miles Ott, Mine Dogucu. Tweet
- Probabilistic Machine Learning - a book series by Kevin Murphy. GitHub repo with links to buy and PDF download
- Forecasting: Principles and Practice by Rob J Hyndman and George Athanasopoulos, Monash University, Australia. R-based examples. Tweet
- "Introduction to Applied Linear Algebra" by Stephen Boyd & Lieven Vandenberghe, http://vmls-book.stanford.edu/, vmls.pdf. Includes examples in Julia
- "Advanced Statistical Computing" by Roger D. Peng. https://bookdown.org/rdpeng/advstatcomp/
- "Modern Statistics for Modern Biology" book by Susan Holmes and Wolfgang Huber. Data and code provided. https://www.huber.embl.de/msmb/index.html
- "Causal Inference Book" by Miguel Hernan and Jamie Robins. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/
- "A Bayesian Course with Examples in R and Stan" book sample, http://xcelab.net/rm/statistical-rethinking/. Link to the full video lectures on the topic, https://www.youtube.com/playlist?list=PLDcUM9US4XdMdZOhJWJJD4mDBMnbTWw_z
- Notes and exercises for the classic free book "An Introduction to Statistical Learning", https://github.com/asadoughi/stat-learning
- "From Algorithms to Z-Scores: Probabilistic and Statistical Modeling in Computer Science" book, http://heather.cs.ucdavis.edu/probstatbook, by Prof. Norm Matloff
- "Introduction to probability" introductory book, published by the American Mathematical Society. Web page and references therein, PDF.
- "Modeling and Solving Linear Programming with R" open book introducing linear modeling in R. Book, PDF, Git with corresponding code
- "An introduction to psychometric theory with applications in R" by William Revelle, the creator of the psych R package. The book with downloadable PDFs: http://personality-project.org/r/book/, the course based on the book: http://personality-project.org/revelle/syllabi/405.syllabus.html, and the psych R package: https://cran.r-project.org/web/packages/psych/index.html
- "Data Science Live Book" by Pablo Casas, from exploratory data analysis to regression/classification. https://livebook.datascienceheroes.com/
- Advanced Data Analysis from an Elementary Point of View, book by Cosma Rohilla Shalizi, PDF, 828 pages. Statistics, with R examples
- How Linear Mixed Model Works by Nikolay Oskolkov
- Introduction to linear mixed models - mixed and random effects, R syntax. GitHub
- "A Brief Introduction to Graphical Models and Bayesian Networks" by Kevin Murphy, 1998. http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html
- Bayesian Methods for Hackers - An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view
- Bayesian Data Analysis demos for Python. https://github.com/avehtari/BDA_py_demos
- A Beginner’s Guide to Eigenvectors, PCA, Covariance and Entropy. http://deeplearning4j.org/eigenvector
- A Python tutorial on Bayesian modeling techniques (PyMC3), https://github.com/markdregan/Bayesian-Modelling-in-Python
- "Understanding Bayes" series of blog posts by Alex Etz. http://alexanderetz.com/understanding-bayes/. The other posts are also worth reading.
- Goeman.pdf - Statistical Methods for Microarray Data
- MPR04.pdf - Introduction to Statistical Methods for Microarray Data Analysis
- How to Do Mediation Scientifically. https://blog.methodsconsultants.com/posts/how-to-do-mediation-scientifically/
- Calculus resources, Python resources, Linear algebra resources by Brandon Rohrer. Source