The FAHA institute provides online and in-person education aimed at a broad range of humanities scholars. Participants will gain a theoretical and practical understanding of text analysis methods, and will learn how to extract content and derive meaning from digital sources, enabling new humanities scholarship.
This repository contains code and links to instructional videos for the summer workshop.
Notebook | Colab | Description |
---|---|---|
quick_start.ipynb | | An introduction to Google Colab. This notebook also demonstrates how to import our data. |
counting_words.ipynb | | This notebook covers some basics of processing text with Python. It invites readers to count words and visualize their results while thinking critically about how text-processing choices can affect analysis. |
this_is_not_a_string.ipynb | | This notebook provides a quick walkthrough of data structures, data types, and common errors. Its purpose is to cultivate an awareness of how a computer processes digital data compared to how we might perceive that same data. |
word_emebeddings.ipynb | | Word embeddings can provide insight into different dimensions of a corpus. Here we use word embeddings to view, at scale, which words are most associated with one another and how these associations change over time. (See "Text Mining as Historical Method" for the original version.) |
topic_modeling.ipynb | | Code for modeling topics. |
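
As a rough illustration of the kind of question the word-counting notebook raises, the sketch below counts words in a short sentence twice, once as written and once after lowercasing. It is not taken from the notebooks: the sample sentence, the `count_words` helper, and the simple regular-expression tokenizer are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for this illustration.
text = "The House met. The house adjourned; the HOUSE will meet again."


def count_words(text, lowercase=True):
    """Count word tokens, optionally folding case before counting."""
    if lowercase:
        text = text.lower()
    # A deliberately simple tokenizer: runs of letters and apostrophes.
    tokens = re.findall(r"[A-Za-z']+", text)
    return Counter(tokens)


case_sensitive = count_words(text, lowercase=False)
case_folded = count_words(text, lowercase=True)

# Without lowercasing, "House", "house", and "HOUSE" are separate entries.
print(case_sensitive.most_common(3))
# With lowercasing, they collapse into a single "house" entry.
print(case_folded.most_common(3))
```

A small preprocessing choice such as lowercasing changes which words appear most frequent, which is why it helps to decide on these steps deliberately before interpreting counts.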
- Click the green "Code" button (top right corner) and select "Download ZIP"
Or
- Clone the repository via terminal:
git clone https://github.com/stephbuon/faha.git
Hansard:
- Hansard Sentences with Woman
- Hansard 19th-Century
- Hansard 1980s
- Hansard 1990s
- Hansard 2000
- Hansard 2010
Congress:
Reddit:
Loudoun County School Board Minutes
- Jo Guldi, "Digital History"
- Lauren Klein, "Quantitative Literary Analysis: Theory and Practice"
- Amanda Regan, "Digital Methods for History (Worksheets)"
- Amanda Regan, "Digital Methods for History (Data)"
- rOpenGov
Buongiorno, Steph, Robert Kalescky, Omar Alexander Cerpa, and Jo Guldi. "The Hansard 19th-Century British Parliamentary Debates with Improved Speaker Names: Parsed Debates, N-Gram Counts, Special Vocabulary, Collocates, and Topics", https://doi.org/10.7910/DVN/ZCYJH8, Harvard Dataverse, V1, 2022, UNF:6:wFlN6+URD9Q9BWYxgZgu1A== [fileUNF]