- Econometrics (Hansen 2018) - Great introduction to graduate econometrics [pdf]
- Econometric Analysis of Cross Section and Panel Data, Second Edition (Wooldridge 2010) - Standard reference that should be on every shelf [Book Description]
- Regression Modeling Strategies (Harrell 2001) - The first three chapters are required reading -- Frank Harrell knows his statistics. [Book Description]
- Applied Nonparametric Econometrics (Henderson and Parmeter 2015) - Start-to-finish nonparametric econometrics with applications and R code [Book Website] [Personal Bookdown Notes]
- An Introduction to Statistical Learning (James et al. 2017) - Perfect introduction to statistical learning and prediction [Book Website] [pdf] [Personal Notes] [Python Code]
- (In Progress) Fluent Python (Ramalho 2015) - [Book Website]
- (In Progress) The Elements of Statistical Learning (Hastie et al. 2009) - [Book Website] [pdf]
- (In Progress) Hands-On Machine Learning with Scikit-Learn and TensorFlow (Géron 2017) - [Book Description] [Personal Notes] [Github]
- Introduction to Data Science with R (O'Reilly 2014) [Course Website]
-
- Introduction to Python
- Intermediate Python for Data Science
- Python Data Science Toolbox (Part 1)
- Cleaning Data in Python
- Supervised Learning with scikit-learn
- Deep Learning in Python
- Extreme Gradient Boosting with XGBoost
- Intro to SQL for Data Science
- DataCamp Projects
- Where Are the Fishes - Explore acoustic backscatter data to find fish in the U.S. Atlantic Ocean.
- Exploring the evolution of Linux - Find out about the development of the Linux operating system by exploring its Git repository history.
- Dr. Semmelweis and the Discovery of Handwashing - Reanalyse the data behind one of the most important discoveries of modern medicine: Handwashing.
I find the best way to learn a specific algorithm or statistical model is to build one from scratch. The following files contain classes and functions that implement the most common statistical learning methods in simplified form.
- Linear Regression (Gradient Descent): LinearRegression_GD.py
- Logistic Regression (Gradient Descent): LogisticRegression_GD.py
- Decision Tree: DecisionTree.py
- Random Forest: RandomForest.py
- KNN: KNN.py
- SVM: SVM.py
- PCA: PCA.py
- Neural Network: NeuralNetwork.py
- Keywords(R, Python, Statistical Modeling, Algorithms)
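
To give a flavor of the from-scratch approach, here is a minimal sketch of linear regression fit by batch gradient descent. It is an illustration only and is not the contents of LinearRegression_GD.py; the function name `fit_linear_gd`, the learning rate, and the iteration count are assumptions chosen for this toy example.

```python
def fit_linear_gd(X, y, lr=0.1, n_iter=2000):
    """Fit y = b0 + b1*x1 + ... + bk*xk by batch gradient descent on mean squared error.

    X is a list of rows (each a list of k features); y is a list of targets.
    Returns [b0, b1, ..., bk], where b0 is the intercept.
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)  # start from all-zero coefficients
    for _ in range(n_iter):
        # Residuals (prediction minus target) under the current coefficients.
        resid = [
            beta[0] + sum(b * x for b, x in zip(beta[1:], row)) - yi
            for row, yi in zip(X, y)
        ]
        # Gradient of (1 / (2n)) * sum(resid^2) with respect to each coefficient.
        grad = [sum(resid) / n]
        grad += [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(k)]
        # Step against the gradient.
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


# Toy data generated exactly from y = 1 + 2x (hypothetical, for illustration).
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
beta = fit_linear_gd(X, y)  # coefficients approach intercept 1 and slope 2
```

The learning rate has to be small enough for the updates to converge; on unscaled features with large values, standardizing the inputs first is usually the safer choice.
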
- Builds daily gridded weather data for the continental United States from 1900-2013.
- A relative anomaly spline interpolation technique calculates daily weather data for 460,000 2.5km x 2.5km grid cells in the US. [Tech. Example]
- Aggregates the gridded data to county-level weather data.
- Keywords(R, Economics, Climate Change, Weather)
Nonlinear Temperature Distributions [R package] [Python Package]
- Calculates nonlinear temperature distributions: degree days and time in each degree.
- The measure accounts for the rise and fall of temperatures during the day.
- Degree days measure accumulated heat above a specified temperature threshold (e.g. degree days above 30C); time in each degree measures the time spent within a specified temperature interval (e.g. time in 30C).
- Keywords(R, Python, Economics, Climate Change, Agronomy)
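
The two measures above can be sketched numerically. The example below assumes the within-day temperature path follows a sine curve between the daily minimum and maximum, which is one common way to account for the daily rise and fall; the packages linked above may use a different method. All function names here are hypothetical and are not the packages' API.

```python
import math


def daily_temperature_path(tmin, tmax, steps=1440):
    """Assumed sine-curve temperature path, sampled at `steps` evenly spaced times."""
    mean = (tmax + tmin) / 2.0
    amplitude = (tmax - tmin) / 2.0
    return [
        mean + amplitude * math.sin(2.0 * math.pi * (i + 0.5) / steps)
        for i in range(steps)
    ]


def degree_days(tmin, tmax, threshold, steps=1440):
    """Average number of degrees above `threshold` over one day (degree days)."""
    path = daily_temperature_path(tmin, tmax, steps)
    return sum(max(t - threshold, 0.0) for t in path) / steps


def time_in_degree(tmin, tmax, lower, steps=1440):
    """Fraction of the day spent in the one-degree interval [lower, lower + 1)."""
    path = daily_temperature_path(tmin, tmax, steps)
    return sum(1 for t in path if lower <= t < lower + 1) / steps


# Hypothetical day with a minimum of 22C and a maximum of 36C.
dd_above_30 = degree_days(22.0, 36.0, threshold=30.0)
share_in_30 = time_in_degree(22.0, 36.0, lower=30.0)
```

A day whose maximum stays below the threshold contributes zero degree days, while a day that is entirely above it contributes its mean temperature minus the threshold. Days that cross the threshold contribute only the portion above it, which is what makes the measure nonlinear in daily mean temperature.
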
- Predicts wine quality based on biophysical characteristics.
- Models wine quality using multinomial logit, linear discriminant analysis, random forest, and extreme gradient boosting.
- Keywords(R, Classification, Economics)