This repository contains information on the overall Capstone sequence and material for the lecture component of the course.
The course materials for each domain of inquiry are maintained by the domain expert. Links to materials for each domain may be found below; if a link is missing, contact the section leader for your domain of choice.
- Wikipedia Edit Wars (Roberts)
- Quantitative Measurement of Artistic Style (Twomey)
- Fair Policing and Predictive Policing (Fraenkel)
- Clustering the Human Genome (Ellis)
- Malware and Graph Embeddings (Fraenkel)
Lecture is held on Mondays at two different times, in the same location:
- Monday 9:00am - 9:50am, CENTR 222 (A00)
- Monday 10:00am - 10:50am, CENTR 222 (B00)
You must attend the discussion corresponding to your chosen domain of inquiry. Attendance is mandatory.
Section | Time | Location | Title |
---|---|---|---|
Discussion A01 | W 9am-9:50am | CENTR 207 | Quantitative Measurement of Artistic Style |
Discussion A02 | W 9am-9:50am | WLH 2113 | Wikipedia Edit Wars |
Discussion A03 | W 9am-9:50am | SDSC E145 | Fair Policing and Predictive Policing |
Discussion B01 | W 10am-10:50am | CENTR 207 | Clustering the Human Genome |
Discussion B02 | W 10am-10:50am | WLH 220 | Malware and Graph Embeddings |
Lab hours are for one-on-one help from both domain experts and methodological experts. Unless separately scheduled with domain experts, lab hours are held Fridays 9:00am - 10:50am in the CSE Basement (B250).
The syllabus for the course may be found here.
Week | Topic: Methodology | Topic: Domain |
---|---|---|
1 | Introduction | Intro to domain problem |
2 | Anatomy of a DS project | Data generating process (context) |
3 | Handling data | Description of data |
4 | Version control | Domain specific techniques I |
5 | Workflow patterns I | Domain specific techniques II |
6 | Workflow patterns II | Discussion of main result |
7 | Version control and data | Standards for evaluation in domain |
8 | Environment independence | Impacts and ethics |
9 | Advanced data handling | Related questions in domain |
10 | Multilingual workflows | Project proposals |
You are welcome to develop your work on your own computer; however, DataHub is available for your use as well. These servers are at least as large as your laptop, and you can use them either as Jupyter servers or via a command-line interface. As the quarters progress, they may be provisioned for more memory-intensive jobs.