
Repository of material for exercise 3 of the course "Computational Text Analysis"

Primary language: HTML. License: Creative Commons Zero v1.0 Universal (CC0-1.0).

CTA-ED-exercise3

In this repository you will find the material for exercise 3 (to be completed for the 28th of February 2024). This is part of the course 'Computational Text Analysis' taught at the University of Edinburgh by Dr. Marion Lieutaud. There is no data file in this repository; the code in the .Rmd script shows you how to download the data directly.

Starting instructions

This is an individual exercise. Start by forking this repository as shown in class (click on 'fork' towards the top right). You can then clone the forked repository (using GitHub Desktop, as we did last week) so you can work on the R code on your own laptop. Once you're happy with your code in R, save your edits and knit your code, then go to GitHub Desktop, click 'commit to main' and then 'push origin'. This pushes your commits to the online forked repository, which I can also see. Don't forget to click on 'push origin': until you do, your commits stay on your laptop and I cannot see them.

The exercise questions are at the end of the .Rmd file.

To do this exercise, you should draw on the code presented in the exercise, the demos, previous exercises (the answers for exercise 2 are in the exercise 2 repo) and the livecoding scripts that we have developed in the tutorials (these are available on Learn, in the tutorial material section for each week). You can also look for help and inspiration in other online resources, for example the quanteda tutorial page [https://quanteda.io], the quanteda quick start guide [https://quanteda.io/articles/quickstart], and Stack Overflow [https://stackoverflow.com]. You may find regex cheatsheets (e.g. [https://hypebright.nl/index.php/en/2020/05/25/ultimate-cheatsheet-for-regex-in-r-2/]) and the stringr package cheatsheet for string detection [https://raw.githubusercontent.com/rstudio/cheatsheets/main/strings.pdf] particularly useful.
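If regular expressions are new to you, the following minimal sketch shows the kind of string detection the cheatsheets cover. It uses base R only, so it runs without installing anything; the `texts` vector and the patterns are made up for illustration and are not the exercise data. In stringr, `str_detect()` and `str_extract_all()` play roughly the same roles as `grepl()` and `regmatches()` do here.

```r
# Hypothetical example sentences (not the exercise data)
texts <- c(
  "The committee met on 28 February 2024.",
  "No date is mentioned here.",
  "Votes: 312 in favour, 97 against."
)

# grepl() returns TRUE/FALSE for each element: does it contain a digit?
has_number <- grepl("[0-9]", texts)

# gregexpr() locates every run of digits; regmatches() pulls them out,
# giving one character vector of matches per text
numbers <- regmatches(texts, gregexpr("[0-9]+", texts))

# Case-insensitive match on the whole word "vote" or "votes";
# \\b marks a word boundary, perl = TRUE selects Perl-style regex
mentions_votes <- grepl("\\bvotes?\\b", texts, ignore.case = TRUE, perl = TRUE)

print(has_number)
print(numbers)
print(mentions_votes)
```

Note the doubled backslash in `"\\b"`: R strings need the backslash itself escaped before the regex engine sees it.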

Make sure to comment your code as you go, ideally line by line, at the very least chunk by chunk. This will help you remember why you wrote your code the way you did and what the code was meant to do, and it will help you explain your code to the class, which you may be asked to do.
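As one possible illustration of line-by-line commenting, here is a short sketch in base R. The example text and the object names are invented for this illustration and do not come from the exercise; the point is only that each comment says why a step is there, not just what the function is called.

```r
# Chunk goal: count how often each word appears in a short text.
# The text below is a made-up example, not the exercise data.
text <- "Text analysis turns text into data."

# Lowercase so that "Text" and "text" count as the same word
text_lower <- tolower(text)

# Strip punctuation, which would otherwise stick to words ("data.")
text_clean <- gsub("[[:punct:]]", "", text_lower)

# Split on runs of whitespace to get a vector of individual words
words <- strsplit(text_clean, "[[:space:]]+")[[1]]

# Tabulate the words and sort from most to least frequent
word_counts <- sort(table(words), decreasing = TRUE)

print(word_counts)
```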

In addition to writing the code, you need to write out your interpretation of the analyses you ran. At a minimum, this means a couple of lines per question.