R
R is a programming language used by some practitioners of data journalism, typically for data analysis and sometimes for visualisation. It isn't the only programming language used in this way: Python and JavaScript can both perform similar tasks, and then there's SQL, Ruby, PHP and plain old Excel. But no one expects you to know them all.
Python, JavaScript and R all have their strengths and weaknesses. If you already know Python or JavaScript (or both), you should be able to pick up R relatively quickly (although there's a useful explanation of R's quirks for programmers here).
This repo contains a bunch of guides and resources to help get started with R, or tackle particular problems. It includes:
- 10 things to do first in R explains some key processes to get started with R. By the end you should be able to import data, analyse it, and save it as a new file on your computer.
- 10 things to do next in R introduces packages in R: libraries of commands for particular problems. It also covers combining datasets, and importing and cleaning sheets from Excel workbooks.
- After that you might want to get your head around how R works more generally. This page explains different types of 'objects' in R, from vectors (ordered sequences of values of a single type) and data frames (tables) to R's names for strings of characters, numbers and TRUE/FALSE values.
- 10 pretty pictures to make in R introduces some of the charting/visualisation capabilities of R.
- 10 Excel things in R explains how you can reproduce pivot tables in R, and why you might want to, as well as other techniques like accessing cells and reproducing COUNTIF.
- Once you're working on projects and need to think about collaboration, saving code and other aspects of project management, read this page on R project files, R scripts, and notebooks in R Markdown.
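To give a flavour of what those guides cover, here is a minimal sketch in base R (no packages needed). It is not taken from the guides themselves: the data, the column names (`council`, `category`, `amount`, `residents`) and the file name `council_totals.csv` are all invented for illustration. It shows R's basic object types, a COUNTIF-style count, a pivot-table-style summary, combining two datasets, and saving and re-importing a CSV.

```r
# Hypothetical data, invented for this example.
# A data frame is a table; each column is a vector of a single type.
spending <- data.frame(
  council  = c("North", "North", "South", "South", "East"),
  category = c("Roads", "Parks", "Roads", "Parks", "Roads"),
  amount   = c(120, 45, 200, 30, 80),
  stringsAsFactors = FALSE
)

# R's names for the basic value types
class(spending$amount)        # numbers are "numeric"
class(spending$council)       # strings of characters are "character"
class(spending$amount > 100)  # TRUE/FALSE values are "logical"

# COUNTIF equivalent: count the rows where category is "Roads"
roads_count <- sum(spending$category == "Roads")

# Pivot-table equivalent: total amount for each council
totals <- aggregate(amount ~ council, data = spending, FUN = sum)

# Combine datasets: match a second (also hypothetical) table on the council column
population <- data.frame(
  council   = c("North", "South", "East"),
  residents = c(1000, 2000, 500),
  stringsAsFactors = FALSE
)
combined <- merge(totals, population, by = "council")
combined$per_head <- combined$amount / combined$residents

# Save the result as a new file, then import it again to check it
out_file <- file.path(tempdir(), "council_totals.csv")
write.csv(combined, out_file, row.names = FALSE)
reimported <- read.csv(out_file, stringsAsFactors = FALSE)
print(reimported)
```

The file is written to a temporary folder here so the sketch is safe to run anywhere; in a real project you would more likely save it in your project folder. Packages such as those introduced in the guides offer other ways to do the same things, but base R is enough to see the basic pattern.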
In addition there are lots and lots of resources to help you get to grips with R. If you want a general tutorial on using R, try some of the following:
- This YouTube playlist is based on a Coursera course on R. It has 12 videos that cover what R is used for, installing it, data types and some functions.
- This YouTube playlist goes into more detail with 24 lessons covering installation and data types through to functions, packages and visualisation.
- There's a list of free e-books on R on r-dir here.
- I maintain bookmarks of MOOCs (online courses) about R here. It's worth searching for others on sites like Coursera.
- .Rddj is a list of resources specifically for data journalism with R
- This video outlines how data journalism outfit FiveThirtyEight use R
- The New York Times' Amanda Cox talks about how they use R in this excellent Data Stories podcast
- Marie-Louise Timcke, a data trainee at Berliner Morgenpost’s Interactive Team, wrote about "A typical ddj workflow in R" in the team
- Rob Grant from Trinity Mirror's data unit has created R for Journalists
Once you understand the basics, however, remember that the best way to learn R is to pick a project, work out what you'll need to do to complete it - and then search online for those techniques in R. Sites like R-Bloggers have lots of posts explaining different techniques to solve different problems.
In a presentation at Hacks/Hackers Birmingham, Andy Pryke also gave these tips: