/ida

An introduction to data analysis, using R. Experimental.

Ivaylo Petev and I use this repository to teach an undergraduate introduction to data analysis. The course is online.

If you are reading the course on its online pages, just replace the .html extension of a page with .R to download the underlying code.
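The same swap can be scripted. The lines below are a minimal sketch: the page address is a placeholder (an assumption, not an actual course URL), so substitute the address of the page you are reading.

```r
# Placeholder address (hypothetical); use the URL of a real course page.
page <- "http://example.org/course/session.html"

# Replace the trailing .html extension with .R to get the script address.
script <- sub("\\.html$", ".R", page)

# Uncomment to save the script in the working directory (needs a real URL).
# download.file(script, destfile = basename(script))
```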

HOWTO

The course pages are formatted in R Markdown syntax and were converted to HTML with knitr 1.4:

install.packages("knitr")
citation("knitr")

The knitting routine is defined in the .Rprofile file. To compile the whole course, set the IDA folder as your working directory and then run ida.build(), which takes a bit more than five minutes on a fiber-optic connection.
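For readers who only want a rough idea of what such a routine does, here is a minimal sketch. It is not the actual ida.build() code: the function name build_all is hypothetical, and it assumes that the pages are stored as .Rmd files and that the knitr and markdown packages are installed.

```r
# Hypothetical stand-in for a build routine, not the code from .Rprofile:
# knit every .Rmd file found in a folder to HTML.
build_all <- function(dir = ".") {
  pages <- list.files(dir, pattern = "\\.Rmd$", full.names = TRUE)
  for (page in pages) {
    # knit2html() runs the R chunks and converts the result to HTML;
    # the output files are written to the current working directory.
    knitr::knit2html(page)
  }
  invisible(pages)  # return the list of processed pages
}
```

Calling build_all() from the course folder would then process each page in turn. Recent knitr releases may direct R Markdown v2 documents to rmarkdown::render() instead.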

Other files are called from the code/ and data/ folders. Most datasets are downloaded on the fly if they are missing from the data/ folder, so make sure that you are online while running the scripts.
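The download-if-missing pattern can be sketched as follows. The helper name get_data, its arguments and the URL in the usage line are illustrative assumptions, not part of the course code.

```r
# Return the local path of a dataset, downloading it first if it is missing.
# Illustrative helper; the course scripts may handle this differently.
get_data <- function(file, url, dir = "data") {
  path <- file.path(dir, file)
  if (!file.exists(path)) {
    # Create the folder if needed, then fetch the file (requires a connection).
    dir.create(dir, showWarnings = FALSE, recursive = TRUE)
    download.file(url, destfile = path, mode = "wb")
  }
  path
}

# Usage with a placeholder URL:
# survey <- read.csv(get_data("survey.csv", "http://example.org/survey.csv"))
```

Because the file is only fetched when it is absent, later runs work offline once the data/ folder has been populated.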

The whole course was coded and taught with RStudio. The code was run on R 2.15.2, 2.15.3, 3.0.0 and 3.0.1, on a MacBook Air running OS X 10.8 and 10.9. Most plots use ggplot2 version 0.9.3.1 (just in case compatibility breaks at some point).
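To compare your own setup against these versions, something like the following should work. It is only a sketch using base R functions, and the variable names are arbitrary.

```r
# Print the version of R that is currently running.
cat(R.version.string, "\n")

# ggplot2 version used for the course, as stated above.
course_ggplot2 <- package_version("0.9.3.1")

# If ggplot2 is installed, print its version and flag newer releases.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  installed <- packageVersion("ggplot2")
  cat("ggplot2", as.character(installed), "\n")
  if (installed > course_ggplot2) {
    message("ggplot2 is newer than the course version; some plots may differ.")
  }
}
```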

CREDITS

Thanks to the Sciences Po Reims staff, who offered invaluable support, and to the small group of students who enrolled in (and survived) the course. The R-2013-Lyon slides have a bit more detail on the practicals.

Bits and pieces of the code were posted to Gist, RPubs and Stack Overflow during development. Thanks to the great R developer and user communities that live online, among which we are now proud to count ourselves.

If you share the spirit of all this, you should consider joining the Foundation for Open Access Statistics and checking out places like OpenCPU, the Open Knowledge Foundation and other initiatives in open access, open data, open source and open science.

HISTORY

Aug-2013: better data management, with large or multiple-file datasets read from ZIP archives. Switched datasets to .csv thanks to GitHub.

Jul-2013: typos and broken links. Removed some functions in .Rprofile that are now part of the questionr package.

Jun-2013: first draft. Everything kind of works; Sessions 5-7 are unlisted, and the code/ folder contains a few more exercises. That's it for now!

May-2013: added more course content and better resolution (100dpi) for all plots.

Apr-2013: added a lot of course content and cleaner plots. Also added the R-2013-Lyon folder for a conference presentation on the course.

Mar-2013: reviewed course structure: fewer files, more code, tons of new examples and exercises.

Feb-2013: more efficient .Rprofile functions and improved knitr routine, tidier code on the early sessions.

Jan-2013: first release.

First release: January 2013.
Last revised: August 2013.