
Datafiles and Rmd script for exercise 1 (due on the 7th of February 2024)

Primary language: R. License: Creative Commons Zero v1.0 Universal (CC0-1.0).

CTA-ED-exercise1

In this repository you will find the data and R Markdown (Rmd) script for exercise 1 (to be completed by the 7th of February 2024). This is part of the course 'Computational Text Analysis' taught at the University of Edinburgh by Dr. Marion Lieutaud.

Starting instructions

Start by cloning this repository as shown in class (click on the green 'Code' button, then 'Open with GitHub Desktop'). Once you have cloned the repository onto your laptop, open the .Rmd file 'CTA_exercise1_wordfreq.Rmd' in RStudio and run the code chunk by chunk, making sure you understand every step. Then look at the exercise questions at the end of the .Rmd file. You will need to write and execute your own code to answer them. You should draw on the code presented in the exercise, the week 2 demo [https://marionlieutaud.github.io/CTA-ED/week-2-demo] and the livecoding scripts that we have developed in the tutorials (these are available on Learn, in the tutorial material section for each week). You can also look for help and inspiration in other online resources, for example the quanteda tutorial page [https://quanteda.io] or the quanteda quick start guide [https://quanteda.io/articles/quickstart], and by looking up problems or error messages on Stack Overflow [https://stackoverflow.com]. That's how programmers do it!

You could also use ChatGPT, but you should know that the R code it produces is often quite outdated; ChatGPT is (for now anyway) not great at coding, especially in R, and you will not learn much by copy-pasting from it. ChatGPT is better at explaining R code than at generating it, so that may be a more useful way to use it if you're keen on it.

As many of you have already noticed, there are different ways to write code for the same operations (e.g. tokens() or unnest_tokens(): these do roughly the same thing, only the first is part of the 'quanteda' package and the second is part of the 'tidytext' package, which follows tidyverse conventions). There is (almost) never a single command or method to do something in R (or in programming in general), so do not let that disorientate you; you can use whichever package and command work: there are many ways to get to the right answer!
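
To make the comparison between tokens() and unnest_tokens() concrete, here is a minimal sketch that tokenises the same sentence both ways. It assumes the 'quanteda', 'tidytext' and 'tibble' packages are installed; the example sentence and the object names (txt, df, words_quanteda, words_tidy) are made up for illustration and are not taken from the exercise data.

```r
# Assumption: quanteda, tidytext and tibble are installed
library(quanteda)
library(tidytext)
library(tibble)

# Hypothetical example sentence (no punctuation, to keep the comparison simple)
txt <- "There are many ways to get to the right answer"

# quanteda: tokens() returns a tokens object, with one element per document
toks_quanteda <- tokens(txt)
# Flatten to a plain character vector and lowercase it by hand
words_quanteda <- unname(tolower(as.character(toks_quanteda)))

# tidytext: unnest_tokens() takes a data frame and returns one row per token
# (it lowercases tokens by default)
df <- tibble(doc_id = 1, text = txt)
toks_tidy <- unnest_tokens(df, output = word, input = text)
words_tidy <- toks_tidy$word

# Compare the two results
print(words_quanteda)
print(words_tidy)
print(identical(words_quanteda, words_tidy))
```

The main practical difference is the shape of the result: quanteda gives you a tokens object that feeds into functions such as dfm(), while tidytext gives you a data frame that you can pass straight to dplyr verbs such as count(). On text with punctuation the two may differ, since unnest_tokens() strips punctuation by default and tokens() keeps it unless you set remove_punct = TRUE.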