R code solutions for common computational biology tasks

Summary

A repository of code examples (with explanations) that work on real-world data files from computational biology, using the data.table and ggplot2 R packages. I think data.table is perfect for computational biology tasks of all sizes. I also think ggplot2 is awesome for building intuitive visualisations of your data. Here I provide working code examples of how I do useful tasks with computational biology data using data.table and ggplot2. This is in no way intended to be 'best-practice' or 'expert' code; these are example solutions to common tasks, written for people new to R and data.table. I have deliberately simplified some examples to aid clarity.

How to use

This GitHub repository is an RStudio project. Download it all as a zip file, or clone it using 'git' on the command line.

Steps to success

  1. Start with introduction.R.
  2. Have a look at the different tasks; each one is a separate R code file.
  3. Have a go at completing the code for the student tasks (1-4).
  4. You may also want to look at the data.table basics file for more detail on the "how" and "why", and on the "special" variables of data.table.
  5. Write your own code for the 'contributed' section and submit a pull request.
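
As a taste of the "special" variables mentioned in step 4, here is a minimal sketch on made-up data. The gene names, sample names and expression values are hypothetical and are not taken from this repository's data files; the code is an illustration of standard data.table symbols (.N, .SD, .SDcols, := and .I), not a copy of the basics file.

```r
library(data.table)

# Hypothetical toy data: expression values for three genes in two samples.
dt <- data.table(
  gene   = c("geneA", "geneA", "geneB", "geneB", "geneC", "geneC"),
  sample = c("s1", "s2", "s1", "s2", "s1", "s2"),
  expr   = c(5.0, 7.0, 0.5, 1.5, 10.0, 12.0)
)

# .N is the number of rows in each group.
counts <- dt[, .(n_rows = .N), by = gene]

# .SD is the subset of data for each group; .SDcols limits it to chosen columns.
means <- dt[, lapply(.SD, mean), by = gene, .SDcols = "expr"]

# := adds a column by reference, without copying the whole table.
dt[, high := expr > 4]

# .I gives row numbers in the original table; here, the highest-expression row per gene.
top_rows <- dt[, .I[which.max(expr)], by = gene]$V1
```

Printing counts, means and dt[top_rows] after running this is a quick way to see what each symbol contributes; the general pattern is dt[i, j, by], read as "take rows i, compute j, grouped by by".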

December 2023 - Status

  • We are recording a YouTube video companion for each task. The video link is available in the R source code for each task; these videos are currently unlisted.
  • Videos are being migrated to a new YouTube channel.

March 2022 - Status

  • First release of student tasks 1-3 is written.
  • First release of the introduction and basics code is written.
  • First release of tasks 1-5 is written.
  • All other tasks are just placeholders.