what | where | when |
---|---|---|
Computing for Human(s\|ists) | University of Victoria, BC | June 8--12, 2015 |
This course is intended for humanities-based researchers with no programming background whatsoever who would like to understand how programs work behind the scenes by writing some simple but useful programs of their own. Over the week the emphasis will be on understanding how computer programmers think, so that participants can join high-level conceptual discussions with more confidence in the future. These general concepts will be reinforced and illustrated with hands-on development of simple programs that can be used right away to help with text-based research and analysis. The language used for most of the course will be Python because of its gentle syntax and powerful extensions. Using the command-line interface and regular expressions will also be emphasized. We will also spend some time glimpsing what is happening in the other DHSI courses, to understand how reading and writing programming code goes well beyond what we touch on in this class.
Visit our class forum here.
- Welcome (John & Dennis)
  - Introductions
  - Overview of week
  - Course Philosophy
  - Why command line?
  - Why Python?
- Working Demo: Introduction to Terminal (John)
  - Cheatsheet
  - Regex
  - Simple problems using the cheatsheet
  - More CLI basics.
  - Lab: Hunting the Whale. (Dennis)
Intro to the Terminal. The morning introduces the terminal via a cheatsheet, a bit of a live demo, and then some problems that participants can solve using the cheatsheet. Lab activities in the afternoon push further into text manipulation in a Unix environment, akin to what participants might actually do with their own materials.
- Text Manipulation at the Command Line w/Dennis
- Exercise 1: Automate Moby
- Anatomy of a Bash program
- Python 1
- Python 2
- Python 3
When to use Bash? | When to use Python? |
---|---|
- automate daily tasks | - data science |
- manage files & folders | - app development |
- remote server admin | - NLTK |
- data munging | - data visualization |
- quick & dirty text manipulation | - glue code |
| | - everything else |
- Building the Zodiac w/ John
A light lecture in the morning builds on experiences from the day before, focusing on the mindset of a programmer and important high-level programming concepts. Following this will be a small set of activities using Python solely in the terminal to give a sense of how these concepts are implemented generally. The second part of the morning will include a live coding demo, using Python in the terminal and a text editor in a separate window, to show how to build a simple tool that participants will be able to follow along with (previously this was a Chinese zodiac symbol tool; see the zodiac folder for these steps). This provides two essential things: observation of an actual coding process and a set of templates that participants can draw on for the rest of the course (and afterwards).
In the afternoon participants will carry out a lab assignment to further hone their skills.
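As a rough idea of the kind of small tool the live demo builds, here is a minimal sketch of a zodiac lookup. It is not the code from the zodiac folder; the function name `zodiac_animal` and the list `ANIMALS` are made up for illustration, and the sketch assumes a simplified rule that maps a whole calendar year to one animal, ignoring the fact that the lunar new year falls in January or February.

```python
# Hypothetical sketch, not the course's own zodiac code.
# Animals in cycle order, starting with the Rat.
ANIMALS = [
    "Rat", "Ox", "Tiger", "Rabbit", "Dragon", "Snake",
    "Horse", "Goat", "Monkey", "Rooster", "Dog", "Pig",
]


def zodiac_animal(year):
    """Return the zodiac animal for a calendar year (simplified rule)."""
    # The cycle repeats every 12 years; subtracting 4 lines the
    # cycle up so that Rat years (e.g. 1984) land on index 0.
    return ANIMALS[(year - 4) % 12]


print(zodiac_animal(2015))  # prints "Goat"
print(zodiac_animal(1984))  # prints "Rat"
```

The whole tool is one list and one line of arithmetic, which makes it a convenient template: swap the list and the rule and the same shape handles other lookups.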
- Building the Zodiac II
- Project brainstorm
- Guest Speaker (reflections from DHSI 2014 class graduate)
- Two approaches to coding environments.

  IDE = interpreter + text editor + file browser

  Two approaches:
  - all rolled into one (PyCharm)
  - ipython (interpreter) + vim (text editor) + command line (manage files)

  There is no wrong answer here. Choose tools that are FOSS, universal, extensible.
- A day in the life. What can be done with Python?
  - LitClock Twitter Bot (project page, twitter account, code, data)
  - Science Surveyor
  - HuViz - Alpha of the Orlando/CWRC Graph / Ontology Viewer
  - Old Bailey Data Warehouse Interface, a tool prototype for mining a copy of the Old Bailey database held on a special server: http://analytics.artsrn.ualberta.ca/digging2data/
- 1hr: write a short 1-2 paragraph description of your project. Concentrate on the goals of what you are trying to accomplish, not the technical details. Spend some time discussing what tools and datasets you would need for the project. For example, a simple project description may be:
Using Python NLTK, our group would like to build an "essay grader" which would take as its input a sample essay and output a score, based on several parameters like sentence length variation and richness of vocabulary.
- 1hr: translate or "formalize" your goals into a series of step-by-step instructions in pseudocode.
- NLTK mini tutorial, CSV and RegEx mini tutorials
- project work for the rest of the day
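To make the "essay grader" example above more concrete, here is a minimal sketch of how its two parameters, sentence length variation and richness of vocabulary, might be measured. The project description mentions NLTK; this sketch deliberately swaps in the standard library with naive regular-expression splitting so that it runs without any installation. All function names are hypothetical, and how to turn these numbers into a single score is left open.

```python
import re
import statistics


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(text):
    """Return lowercase words made of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def essay_features(text):
    """Measure sentence length variation and vocabulary richness."""
    sentences = split_sentences(text)
    lengths = [len(tokenize(sentence)) for sentence in sentences]
    words = tokenize(text)
    return {
        "sentence_count": len(sentences),
        # Standard deviation of words per sentence: 0 means every
        # sentence has the same length.
        "length_stdev": statistics.pstdev(lengths) if lengths else 0.0,
        # Type-token ratio: distinct words divided by total words.
        "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
    }


sample = "The whale swam. The whale swam far away from the ship!"
print(essay_features(sample))
```

A real grader would likely want NLTK's tokenizers, since the naive split above is fooled by abbreviations such as "Dr." and by quoted speech.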
9:30 - 11:30am
Reevaluate the scope of your project. Cut out inessential functionality. We are trying to get to a "minimally viable" prototype stage. Take notes via code comments throughout.
11:30 - noon
Concluding remarks. Showcase and Plenary meeting after.
Reads a provided CSV file containing the names, authors, and URLs of film scripts from an online database, downloads each of them, and names each file according to the AuthorName_FilmTitle.html convention.
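A program like the one just described could be sketched as follows. This is an illustration under assumptions, not the participants' actual code: the CSV column names `title`, `author`, and `url` are guesses, all function names are hypothetical, and the example URL is a placeholder. Spaces and punctuation are dropped from names so that the result is a safe file name.

```python
import csv
import io
import urllib.request


def safe_part(text):
    """Keep only letters and digits, e.g. 'Nora Ephron' -> 'NoraEphron'."""
    return "".join(ch for ch in text if ch.isalnum())


def script_filename(author, title):
    """Build a name following the AuthorName_FilmTitle.html convention."""
    return f"{safe_part(author)}_{safe_part(title)}.html"


def read_catalog(csv_file):
    """Yield (title, author, url) from an open CSV file with a header row.

    The column names 'title', 'author' and 'url' are assumptions.
    """
    for row in csv.DictReader(csv_file):
        yield row["title"], row["author"], row["url"]


def download_all(csv_path):
    """Download every script listed in the CSV into the current folder."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for title, author, url in read_catalog(handle):
            urllib.request.urlretrieve(url, script_filename(author, title))


# Try the parsing and naming steps on an in-memory CSV (no network needed).
demo = io.StringIO(
    "title,author,url\n"
    "Jaws,Peter Benchley,http://example.com/jaws.html\n"
)
for title, author, url in read_catalog(demo):
    print(script_filename(author, title), "<-", url)
```

Keeping the naming and CSV-reading steps in their own functions means they can be checked without touching the network; only `download_all` actually fetches anything.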
This program takes an HTML file, extracts only the part containing the film script, strips out all HTML code, and saves the result as a .txt file.
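The extraction step described above might look something like the sketch below. It assumes the script text sits inside a `<pre>` element, which is a guess about the source pages and would need adjusting for a particular site. The class and function names are hypothetical; the parsing itself uses `html.parser.HTMLParser` from the standard library.

```python
from html.parser import HTMLParser
from pathlib import Path


class ScriptExtractor(HTMLParser):
    """Collect the text found inside <pre> elements, dropping all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0    # how many <pre> elements we are currently inside
        self.chunks = []  # pieces of script text found so far

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Text outside <pre> (menus, footers, ads) is ignored.
        if self.depth > 0:
            self.chunks.append(data)


def extract_script(html):
    """Return the plain text of the script portion of an HTML page."""
    parser = ScriptExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def convert_file(html_path):
    """Write the extracted script next to the HTML file, as a .txt file."""
    source = Path(html_path)
    text = extract_script(source.read_text(encoding="utf-8", errors="replace"))
    target = source.with_suffix(".txt")
    target.write_text(text, encoding="utf-8")
    return target
```

Calling `convert_file("PeterBenchley_Jaws.html")` (a made-up file name) would write `PeterBenchley_Jaws.txt` in the same folder.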