Data science has sometimes been referred to as “data poetry.” It is the art of finding the story that your data tells and clearly conveying that story to others. This class will cover topics in machine learning and statistics necessary for applied research in modern psychology and neuroscience. Emphasis will be placed on fundamental data science theory that can support learning more complex analytical methods, as well as basic applied skills for performing data analysis in a research context.
Topics include (but are not limited to):
• GitHub and version control
• Jupyter notebooks & Markdown files
• Data organization & archiving
• Data visualization
• Linear regression models
• Data cleansing
• Reducible vs. irreducible error
• Logistic regression
• Linear/Quadratic discriminant analysis
• K-Nearest Neighbors
• Cross-validation
• Bootstrapping
• Model selection
• LASSO & Ridge regression
• Overfitting
• Dimensionality reduction
• Decision trees
• Support vector machines
• Bayes factors
Lectures will focus heavily on theory, while lab portions of the course will use the R statistical language to provide hands-on data analysis experience. The goal of this class is to provide you with both the theoretical and the practical knowledge needed to use modern data science tools in psychology research.