alan-turing-institute/rds-course

Module 2 Delivery (2023)

Taught Material

Rough timings:

| Section | Title | Start Time | Notes |
|---|---|---|---|
| Overview | | 13:05 | |
| 2.1.1 | Where to find data | | |
| 2.1.2 | Legality and Ethics | | |
| 2.1.3 | Pandas intro | 13:26 | as in 2021 this is a rapid switch from legal/ethics issues to technical, could do with a pause/breather/smoother transition somehow |
| SHORT BREAK | | 13:48 (after ~10 mins of questions) | |
| 2.1.4 | Data sources & formats | 14:00 | |
| 2.1.5 | Controlling access | | |
| 2.2.1 | Data consistency | | up to null values |
| LONG BREAK | | 14:53 (after ~10 mins of questions) | |
| 2.2.1 | Data consistency | 15:15 | from null values |
| 2.2.2 | Modifying columns & indices | | |
| 2.2.3 | Feature engineering | | rushed through (from binning onwards) |
| 2.2.4.1 | Time & Date | | rushed through |
| 2.2.4.2 | Text Data | | rushed through |
| SHORT BREAK | | 16:02 | |
| 2.2.4.3 | Categorical Data | 16:15 | |
| 2.2.4.4 | Image Data | | |
| 2.2.5 | Privacy & Anonymisation | | |
| 2.2.6 | Linking Datasets | | |
| 2.2.7 | Missing Data | | |
| | Wrap-up (final Qs, pre-reqs for hands-on) | | |
| | End | 17:02 (after ~10 mins of questions) | |

  • Needed to leave more time for questions if we wanted to strictly keep to time
  • Did better than last time at covering all the material, but it's a lot of material to take in (and deliver)

Hands-on

  • One participant at the end of the taught module said they hoped to do more collaborative things on GitHub.
  • We gave people the option to go to whichever room they liked, but all except ~5 people went to the same room
    • few questions, not much talking/collaboration
    • but for people working in groups, tricky to manage people working at different speeds/at different abilities
    • maybe better to enforce smaller groups? But some people do just prefer to work through the hands-on material independently.