The objective of this summer research project is to create an R package that lets users easily visualize "all" visualizable linear models.
The main deliverable is a GitHub repository that contains the following:
- An R package for the linear model visualization functions.
- A log of hours spent on the summer research by each student, including the date, hours, and an activity summary.
- A presentation to the Statistics Department.
- A presentation at the CSM annual research conference.
- A manuscript detailing the work for submission to the R Journal, and an abstract for submission to the RStudio Conference.
1. Utilize GitHub to collaborate on project materials and updates.
- Karl Broman's GitHub tutorial.
- Jenny Bryan's Happy Git with R.
- Also check out using version control with RStudio and this video on Git and RStudio.
- Introduction to R projects.
2. Adhere to good programming practices.
- Write all R code according to Hadley Wickham's style guide.
- Use the tidyverse style guide as an additional reference.
- Use Hadley Wickham's R for Data Science book as a reference (Chapter 19 discusses functions).
3. Create an R package that contains the visualization functions. At a minimum, the package should be installable through devtools; as time allows, consider submitting it to CRAN.
- Hilary Parker's blog post "Writing an R package from scratch".
- Hadley Wickham's R Packages book.
- RStudio's video on package writing in RStudio.
4. Provide documentation for the R package.
- Use the roxygen2 package to document code.
- Write a vignette to accompany the package.
- Consider using pkgdown to create a website.
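
As a rough illustration of roxygen-style documentation, the sketch below documents one function with `#'` comments. The function name `plot_slr()` and its arguments are hypothetical and chosen only for this example; they are not part of any existing package.

```r
#' Plot a simple linear regression line
#'
#' Hypothetical example function: fits `y ~ x` with `lm()` and draws the
#' data together with the fitted line using base graphics.
#'
#' @param formula A model formula with one quantitative predictor, e.g. `mpg ~ wt`.
#' @param data A data frame containing the variables named in `formula`.
#' @return The fitted `lm` object, returned invisibly.
#' @examples
#' plot_slr(mpg ~ wt, data = mtcars)
#' @export
plot_slr <- function(formula, data) {
  model <- lm(formula, data = data)

  # The model frame holds the response in column 1 and the predictor in column 2.
  mf <- model.frame(model)
  plot(mf[[2]], mf[[1]], xlab = names(mf)[2], ylab = names(mf)[1])

  # Add the fitted line (intercept and slope taken from the model).
  abline(model)

  invisible(model)
}
```

Inside a package, running `devtools::document()` turns comments like these into help files in `man/` and updates `NAMESPACE`.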
5. Review existing R packages and functionality for linear model visualization.
- geom_smooth() in ggplot2?
- How do you plot a regression line in plotly?
- Many existing approaches simply plot predictions from a fitted regression model. Find and review these!
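
For reference while reviewing, the "plot the predictions" approach mentioned above can be sketched in base R as follows. The built-in `mtcars` data and the helper name `plot_predictions()` are assumptions made for this example only. In ggplot2, `geom_smooth(method = "lm")` produces a comparable line for a single predictor.

```r
# Fit a simple linear regression on the built-in mtcars data.
model <- lm(mpg ~ wt, data = mtcars)

# Predict over an evenly spaced grid spanning the observed predictor range.
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 100))
grid$mpg_hat <- predict(model, newdata = grid)

# Hypothetical helper: scatterplot of the data with the predicted line on top.
plot_predictions <- function() {
  plot(mtcars$wt, mtcars$mpg, xlab = "wt", ylab = "mpg")
  lines(grid$wt, grid$mpg_hat)
  invisible(grid)
}
```

Calling `plot_predictions()` draws the figure. The same grid-and-predict pattern works for any fitted model that has a `predict()` method, which is why so many existing tools rely on it.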
6. Create functions (or identify the best existing ones) for items that are not currently easy to achieve in R. Make sure that these cannot already be accomplished with existing R packages.
- Simple linear regression line
- Regression model with one categorical variable and one quantitative variable (parallel lines model)
- Regression model with one categorical variable, one quantitative variable, and the interaction term (different intercepts and slopes)
- Regression model with one quantitative variable and its interaction with a categorical variable (different slopes, but common intercept)
- Can these all accommodate polynomial models that include higher-order terms of the one quantitative variable?
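
To make the models listed above concrete, here is a minimal sketch of how each could be specified with `lm()` and drawn with base graphics. The built-in `iris` data, the object names, and the helper `plot_fits()` are assumptions made for this example, not a proposed package design.

```r
# Response: Sepal.Length; quantitative: Sepal.Width; categorical: Species.
slr        <- lm(Sepal.Length ~ Sepal.Width, data = iris)            # one line
parallel   <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)  # parallel lines
separate   <- lm(Sepal.Length ~ Sepal.Width * Species, data = iris)  # different intercepts and slopes
common_int <- lm(Sepal.Length ~ Sepal.Width + Sepal.Width:Species,
                 data = iris)                                        # different slopes, common intercept
quadratic  <- lm(Sepal.Length ~ poly(Sepal.Width, 2, raw = TRUE) + Species,
                 data = iris)                                        # parallel quadratic curves

# Hypothetical helper: draw the data and one fitted curve per species.
plot_fits <- function(model, data = iris) {
  plot(data$Sepal.Width, data$Sepal.Length,
       col = as.integer(data$Species),
       xlab = "Sepal.Width", ylab = "Sepal.Length")
  x_grid <- seq(min(data$Sepal.Width), max(data$Sepal.Width), length.out = 50)
  species_levels <- levels(data$Species)
  for (i in seq_along(species_levels)) {
    new_data <- data.frame(
      Sepal.Width = x_grid,
      Species = factor(species_levels[i], levels = species_levels)
    )
    # Colour i matches the point colour used for the i-th species level.
    lines(x_grid, predict(model, newdata = new_data), col = i)
  }
  invisible(model)
}
```

For example, `plot_fits(separate)` draws three lines with different intercepts and slopes, while `plot_fits(quadratic)` draws three parallel quadratic curves. Because the helper only relies on `predict()`, the polynomial case needs no special handling.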