This Wiki contains information for Social Data Science, Fall 2021, Rutgers University.
The most useful thing you can do to prepare for this course is to familiarize yourself with R.
Chris Bail at Duke has recorded a series of short videos introducing R. You can find a Twitter thread and a link to the materials here. The videos are intended as an introduction to programming in R for a social science audience. They were created as a resource for participants in the Summer Institute in Computational Social Science who do not have much prior experience. The last video also covers GitHub and RMarkdown (discussed below).
The main textbook we will be using this semester is R for Data Science (R4DS) by Hadley Wickham and Garrett Grolemund (which Bail also uses in the videos discussed above). We will work through most of this book over the first few weeks of the semester, and I have indicated the relevant chapters for each week. The textbook focuses on using R to work with data, drawing upon a set of packages known as the tidyverse, of which the first author, Hadley Wickham, is the lead developer. It is a well-organized and easy-to-follow introduction to the fundamentals of R. If you have the time, I recommend starting to read (or skim) the chapters listed in the syllabus and using RStudio to test out some of the examples (if you use the online version of the book, you can easily copy over the code).
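To give a flavor of the tidyverse style the book teaches, here is a minimal sketch. It assumes the dplyr package is installed (for example via `install.packages("tidyverse")`) and uses `mtcars`, a dataset that ships with R. The example is illustrative and is not taken from R4DS; the name `summary_by_cyl` is my own.

```r
# Load dplyr, one of the core tidyverse packages
# (assumes it is already installed)
library(dplyr)

# mtcars is a built-in data frame with one row per car model
summary_by_cyl <- mtcars %>%
  filter(hp > 100) %>%      # keep cars with more than 100 horsepower
  group_by(cyl) %>%         # group by number of cylinders
  summarise(
    n_cars  = n(),          # number of cars in each group
    avg_mpg = mean(mpg)     # average miles per gallon in each group
  ) %>%
  arrange(desc(avg_mpg))    # highest average mpg first

print(summary_by_cyl)
```

Each step takes a data frame and returns a data frame, which is why the steps can be chained with the pipe (`%>%`). Pasting this into the RStudio console is a quick way to check that your installation works.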
If you get stuck working on any problems, I highly recommend searching for help on Stack Overflow, an online community devoted to coding advice. There is sometimes a bit of an art to phrasing a search about a coding problem, but in many cases you can paste in the error message and quickly find an answer. You can also post your own questions on Stack Overflow, but make sure to read the guidelines first.
We will be using RStudio for assignments in this class. You can download the free version of RStudio Desktop here. RStudio is an integrated development environment (IDE) with a great deal of helpful functionality. You can use it for a range of tasks, including writing and running code, finding help, viewing plots, and examining data.
RMarkdown is a framework designed by the developers of RStudio which allows you to combine code, text, and other elements into the same document. RMarkdown can create slideshows and other documents. For example, the course syllabus and the slides I will be using in lecture are both created in RMarkdown. Course assignments (and potentially your final papers) will be written using RMarkdown. I recommend taking a look at this tutorial.
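As a rough illustration of how code and text sit together in one document, the sketch below writes a tiny RMarkdown file from R and, if possible, renders it to HTML. The file name `example.Rmd` and the document contents are hypothetical, and the rendering step assumes the rmarkdown package and pandoc are available (pandoc is bundled with RStudio). In practice you would normally create the file in RStudio and click Knit instead.

```r
# Three backticks, built programmatically so this example nests cleanly
fence <- strrep("`", 3)

# Lines of a minimal RMarkdown document:
# a YAML header, a sentence of text, and one R code chunk
rmd_lines <- c(
  "---",
  'title: "My first RMarkdown document"',
  "output: html_document",
  "---",
  "",
  "This sentence is ordinary text. The chunk below is R code.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)

# Write the document to a temporary directory (hypothetical file name)
rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_path)

# Render to HTML only if rmarkdown and pandoc are both available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_path, quiet = TRUE)
  message("Rendered to: ", out_file)
}
```

When the document is rendered, the chunk is executed and its output is placed directly beneath the code, so the text, the code, and the results stay in sync.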
GitHub is a website used to store and share code. Its most useful feature is version control: if you regularly store your code on GitHub, it is easy to keep a record of how your work changes over time (similar to Track Changes in Word documents). All course materials will be hosted in a GitHub repository, and you will use GitHub to submit your course assignments.
You can sign up for GitHub here (there is no need to pay; a free account is enough). As students, you can get access to certain paid features for free by signing up for the Student Developer Pack. Once you have made your account, follow the instructions here. I haven’t done this for a while, but you will likely need to upload a copy of your student ID or some other information. You should receive access within a few days.
Once you have an account, this tutorial will introduce you to most of what you will need to know.
- Install RStudio
- Install RMarkdown and read the tutorial
- Make a GitHub account
- Apply for the GitHub Student Developer Pack
- Follow the GitHub tutorial