
Introduction to R and RStudio for INSPIRE

Primary language: R | License: MIT

INSPIRE_Intro2R

How to use this repository

The scripts folder contains all of the R scripts that are used in the live Introduction to R session held by INSPIRE. You can refer back to these scripts as you learn to use R.

This Readme file also contains a list of resources that you may find useful. The list is not intended to be definitive or complete, but to provide resources for both beginners and more advanced R users. There is a section at the bottom containing R resource lists made by others; these contain many more resources and may be useful as you progress past the information included here.


Resources:

Learning R

Yes, you read that correctly. Packages for learning R... in R.

Data Cleaning/Transformation/Wrangling

These packages are used to manipulate, transform and clean up data. If you're a beginner, I would recommend starting with tidyverse, but I have included some advantages and disadvantages to guide you. For a more in-depth comparison, take a look at this Stack Overflow thread involving Hadley Wickham himself. If you don't know who Hadley Wickham is, get used to the name. He is the Chief Data Scientist at RStudio, he created the tidyverse package collection and he has written many books on R/RStudio and his various packages, many of which are featured in the Learning Resources - Free Online Books section.

  • tidyverse package collection (more specifically, dplyr and its associated packages)
    • Advantages: easily readable/comprehensible
    • Disadvantages: "tidyverse syntax" differs from base R syntax

OR

  • data.table
    • Advantages: very fast (good for very large datasets/"big data"), syntax is similar to base R
    • Disadvantages: not easily readable/comprehensible
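To make the syntax differences above concrete, here is a minimal sketch of one small task (mean miles per gallon by cylinder count, for heavier cars) written three ways: base R, dplyr and data.table. The built-in mtcars dataset and the 2.5 weight cut-off are my own choices for illustration, not something taken from the session scripts. The dplyr and data.table parts only run if those packages are installed.

```r
# Task: mean mpg per number of cylinders, keeping only cars with wt > 2.5.
# mtcars ships with R, so no data needs to be downloaded.

# 1. Base R: subset with [ ], then aggregate with a formula
heavy <- mtcars[mtcars$wt > 2.5, ]
base_result <- aggregate(mpg ~ cyl, data = heavy, FUN = mean)
print(base_result)

# 2. dplyr (tidyverse): named verbs chained together with the pipe
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  dplyr_result <- mtcars %>%
    filter(wt > 2.5) %>%
    group_by(cyl) %>%
    summarise(mpg = mean(mpg))
  print(dplyr_result)
}

# 3. data.table: everything inside one DT[i, j, by] call
#    i = which rows, j = what to compute, by = grouping
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(mtcars)
  dt_result <- dt[wt > 2.5, .(mpg = mean(mpg)), by = cyl]
  print(dt_result)
}
```

All three compute the same group means. The dplyr version reads almost like a sentence, while the data.table version is the most compact and stays close to base R's square-bracket indexing.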

Visualisations and Plots

  • ggplot2

OR

  • plotly

Tables

  • gt
  • xtable
  • huxtable
  • kable (from the knitr package) and kableExtra

Descriptive Statistics

  • gtsummary
  • Hmisc
  • arsenal

Strings and Regular Expressions (Regex)

  • stringr (included in tidyverse)

Dates and Times

  • lubridate

Reproducible Research

  • here
  • rrtools
  • renv
  • groundhog

Bioinformatics/Genomics

Interactive Dashboards

Best Practices for Programming:

These two papers demonstrate high-impact ways to improve the reliability and reproducibility of your programming. Some of these methods are commonly used in non-academic computer science roles, and scientific computing is exactly that: it sits at the crossroads of science and computing. So, branching out and learning skills commonly used in computer science and non-academic programming roles will help to improve your code, and decrease the frequency and impact of mistakes that you may make along the way.

R Style Guides

A style guide provides a standardised way to write code - think of it as a dialect for your language. Pick a style guide and actually use it. If you're working within a team, they may already use a particular style, so it is worth asking if they have a preference - that way everyone in the group will write similarly styled code.
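As a rough illustration of what a style guide changes, here is the same small function written twice. The function name and the conventions in the second version (snake_case names, `<-` for assignment, spaces around operators) are my own example, loosely following the tidyverse style guide; other guides make different choices, and the point is consistency rather than any one rule.

```r
# Cramped version: no spacing, capitalised names, `=` for assignment
MeanAbove=function(x,Threshold){mean(x[x>Threshold])}

# Same logic, written to a consistent style
mean_above <- function(x, threshold) {
  # Average of the values in x that are strictly greater than threshold
  mean(x[x > threshold])
}

mean_above(c(1, 5, 10), threshold = 2)  # 7.5
```

Both versions behave identically; only the second is easy to scan when you come back to it months later.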

Some recommended style guides are listed below:

OR

Learning Resources:

R Programming and Data Science

Blogs and Tutorials
Books

Data Visualisations

Blogs and Tutorials

RMarkdown

Shiny

Shiny is a package used for creating interactive web dashboards. These can be particularly useful with datasets that are continuously growing, or for interactive/customisable data visualisations. The Shiny documentation itself is a good introduction, but a few books on building Shiny apps have been released recently, all of which are mentioned below.

Blogs and Tutorials
Books

Geographic Information System (GIS)

GIS is used to make maps and visualise geographic data. R can do this and I have used it to create simple UK county heatmaps before, but if you are getting into serious GIS territory, you may benefit from dedicated GIS software such as ArcGIS (paid-for, proprietary software, but a licence can be obtained through UoB software centre) or QGIS (free, open-source software). If you do insist on using R for GIS (as I did), then I've included some books below:

Statistics


Lists of R Resources by Others: