
Intensive 1-week introduction to text mining with Python

title: Text Mining Bootcamp
place: Roskilde University, Denmark
time: August 14-18, 2017, 9 AM to 3 or 4 PM
instructors: Peter Leonard (Yale University Library) & Kristoffer L. Nielbo (DIGHUMLAB)
contact: kln@cas.au.dk

[add description]


DAY 1: SWC|Programming with Python

Time Content Instructor
09:00-09:30 Welcome & Setup KLN
09:30-10:00 Analyzing Tabular Data KLN
10:00-10:30 Repeating Actions with Loops KLN
10:30-11:00 Storing Multiple Values in Lists KLN
11:00-11:30 Analyzing Data from Multiple Files KLN
11:30-12:00 Making Choices KLN
12:00-13:00 Lunch
13:00-13:30 Creating Functions KLN
13:30-14:00 Errors and Exceptions KLN
14:00-14:30 Defensive Programming KLN
14:30-15:00 Debugging KLN
15:00-15:30 Command-Line Programs KLN
15:30-16:00 Finish

DAY 2: DIGHUMLAB|From Print to Probability

Time Content Instructor
09:00-09:30 Welcome ins
09:30-10:00 Reading Unstructured Data ins
10:00-10:30 Cleaning & Segmentation ins
10:30-11:00 Free Play ins
11:00-11:30 Language Normalization ins
11:30-12:00 Term Frequencies ins
12:00-13:00 Lunch
13:00-13:30 Dispersion and Distributions ins
13:30-14:00 Vector Space Representations ins
14:00-14:30 Project hour ins
14:30-15:00 Project hour ins

DAY 3: Density, Information and Entities

Time Content Instructor
09:00-09:30 Welcome KLN
09:30-10:00 Lexical Density KLN
10:00-10:30 Readability KLN
10:30-11:00 Free Play KLN
11:00-11:30 Information KLN
11:30-12:00 Time Series KLN
12:00-13:00 Lunch
13:00-13:30 Sentiment Vectors KLN
13:30-14:00 Entity Extraction KLN
14:00-14:30 Project hour
14:30-15:00 Project hour

DAY 4: Latent Variables and (Multiple) Relations

Time Content Instructor
09:00-09:30 Welcome PL
09:30-10:00 `` PL
10:00-10:30 `` PL
10:30-11:00 `` PL
11:00-11:30 `` PL
11:30-12:00 `` PL
12:00-13:00 Lunch
13:00-13:30 `` PL
13:30-14:00 `` PL
14:00-14:30 Project hour PL
14:30-15:00 Project hour PL

DAY 5: Associations and Classification
topics: word embedding, document similarity and classification

Time Content Instructor
09:00-09:30 Welcome ins
09:30-10:00 `` ins
10:00-10:30 `` ins
10:30-11:00 `` ins
11:00-11:30 `` ins
11:30-12:00 `` ins
12:00-13:00 Lunch
13:00-13:30 `` ins
13:30-14:00 `` ins
14:00-14:30 `` ins
14:30-15:00 Finish ins