Science of Science Summer School (S4) Lectures

Introduction

These are the lecture slides, notebooks, datasets, and exercises for the Science of Science Summer School (S4) 2021 (https://s4.scienceofscience.org).

S4 2021 is hosted by the School of Information Studies at Syracuse University and organized by Daniel Acuna and Stephen David.

Structure

Day 1: Introduction

  • Presentation: Opening and introduction to science of science
  • Teaching #1: Introduction to the environment: JupyterHub, notebooks, GitHub repository
    • Activity #1: Log in to the system, run a notebook, save it, and submit it through nbgrader
  • Teaching #2: Explore datasets (MAG sample), funding (Ying Ding), mentorship (Qing Ke), content (PubMed open access), images (PubMed open access)
    • Activity #2: Simple computation of citations, funding across years, mentorship, text, and images
  • Teaching #3: Introduction to Python: basic principles, loading libraries, debugging
    • Activity #3: Run a simple program in Python, load data into pandas, and run a simple regression
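
The last activity above (load data into pandas, then run a simple regression) might look roughly like the sketch below. The inline table and its column names (`year`, `citations`) are made up for illustration and are not one of the course datasets; `fit_line` is a hypothetical helper, not part of the course notebooks.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical toy data; the real activity would read a course dataset instead.
CSV_TEXT = """year,citations
2015,12
2016,15
2017,19
2018,22
2019,27
"""


def fit_line(df, x_col, y_col):
    """Ordinary least squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(df[x_col], df[y_col], deg=1)
    return float(slope), float(intercept)


# Load the table into a DataFrame, then regress citations on year.
df = pd.read_csv(io.StringIO(CSV_TEXT))
slope, intercept = fit_line(df, "year", "citations")
print(f"citations change by about {slope:.2f} per year")
```

Swapping `io.StringIO(CSV_TEXT)` for a file path is all that is needed to run the same steps on a real CSV file.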

Day 2: Machine learning and artificial intelligence

  • Presentation: Overview of machine learning in science of science
  • Teaching #1: Probability, statistics, learning, errors, functions
    • Activity #1: Different kinds of learning and functions
  • Teaching #2: Model complexity and interpretability
    • Activity #2: Show overfitting, underfitting, and the bias-variance tradeoff
  • Teaching #3: Unsupervised learning, semi-supervised learning
    • Activity #3: Dimensionality reduction, NLP, reinforcement learning
  • Teaching #4: TBA
    • Activity #4: TBA
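
As a rough illustration of the overfitting and underfitting activity above, the sketch below fits polynomials of increasing degree to a small noisy sample and compares training and test error. The sine-curve data, the noise level, and the chosen degrees are all assumptions made for this example, not material from the lectures.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_data(n):
    """Noisy samples from a sine curve (a synthetic stand-in for real data)."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y


def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


x_train, y_train = make_data(15)    # small training set
x_test, y_test = make_data(200)     # larger held-out set

errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree {degree}: train={errors[degree][0]:.3f} test={errors[degree][1]:.3f}")
```

Training error can only go down as the degree grows, while test error typically falls and then rises again: the straight line underfits, and the high-degree polynomial tends to chase the noise in the 15 training points.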

Day 3: Network science

  • Presentation: Overview of Network Science
  • Teaching #1
    • Activity #1
  • Teaching #2
    • Activity #2
  • Teaching #3
    • Activity #3
  • Teaching #4
    • Activity #4

Day 4: Deep learning

  • Presentation: Lucy Wang from the Allen Institute for Artificial Intelligence (AI2)
  • Teaching #1: Neural networks (neurons and learning)
    • Activity #1: Try neural network playground
  • Teaching #2: Models for temporal data (BiLSTM, Transformers, etc.)
    • Activity #2: Citation worthiness prediction
  • Teaching #3: Models for image analysis (CNN, ResNet)
    • Activity #3: Image analysis, misleading graphs
  • Teaching #4: Bias in AI (Lizhen's presentation)
    • Activity #4: Example of bias in AI
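
To make "neurons and learning" from Teaching #1 above concrete, here is a minimal sketch of a single sigmoid neuron trained with batch gradient descent. The task (logical OR), the learning rate, and the number of epochs are arbitrary choices for illustration; the lectures may use a different setup or a deep learning library.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + np.exp(-z))


def train_neuron(X, y, lr=1.0, epochs=2000):
    """Train one sigmoid neuron with batch gradient descent on cross-entropy loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        error = p - y                      # loss gradient w.r.t. the pre-activation
        w -= lr * (X.T @ error) / len(y)   # update the weights
        b -= lr * error.mean()             # update the bias
    return w, b


# Four input patterns and their logical-OR targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)

w, b = train_neuron(X, y)
predictions = (sigmoid(X @ w + b) > 0.5).astype(int)
print(predictions)
```

A single neuron can only draw one linear boundary, so it learns OR but would fail on XOR, which is one motivation for stacking neurons into layers.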

Day 5: Causal inference

  • Presentation: Jianxuan Liu (Syracuse University)
  • Teaching #1: From correlation to causation - intuitive example
    • Activity #1: Try discovering whether there is causality
  • Teaching #2: Methods for causal inference - theory of propensity score matching
    • Activity #2: Simple example using PSM with logistic regression
  • Teaching #3: Difference-in-differences, regression discontinuity, matching
    • Activity #3: Example from the literature (Aaron Clauset, Dashun's paper)
  • Teaching #4: Machine learning perspective (do-calculus), DAGs
    • Activity #4: Backdoor, transportability, etc.
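
Propensity score matching with logistic regression (Activity #2 above) can be sketched on synthetic data as follows. Everything here is an assumption made for illustration: the data-generating process, the true effect of 2.0, and the hand-written gradient-descent logistic regression, which stands in for whatever library the course notebooks actually use.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
TRUE_EFFECT = 2.0

# Synthetic data: x confounds both treatment assignment and the outcome.
x = rng.normal(size=n)
treated = rng.random(n) < 1.0 / (1.0 + np.exp(-x))
outcome = TRUE_EFFECT * treated + 3.0 * x + rng.normal(size=n)


def fit_logistic(features, labels, lr=0.5, steps=3000):
    """Gradient-descent logistic regression; returns [intercept, slope]."""
    design = np.column_stack([np.ones(len(features)), features])
    weights = np.zeros(design.shape[1])
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-design @ weights))
        weights -= lr * design.T @ (probs - labels) / len(labels)
    return weights


# Step 1: estimate each unit's propensity score P(treated | x).
weights = fit_logistic(x, treated.astype(float))
propensity = 1.0 / (1.0 + np.exp(-(weights[0] + weights[1] * x)))

# Step 2: match each treated unit to the control unit with the closest
# propensity score (nearest neighbour, with replacement).
treated_idx = np.flatnonzero(treated)
control_idx = np.flatnonzero(~treated)
gaps = np.abs(propensity[treated_idx][:, None] - propensity[control_idx][None, :])
matches = control_idx[gaps.argmin(axis=1)]

# Step 3: compare the naive difference in means with the matched estimate.
naive = float(outcome[treated].mean() - outcome[~treated].mean())
matched = float(np.mean(outcome[treated_idx] - outcome[matches]))
print(f"naive: {naive:.2f}, matched: {matched:.2f}, true: {TRUE_EFFECT}")
```

Because treated units tend to have larger `x`, the naive comparison mixes the treatment effect with the effect of `x`; matching on the propensity score compares units with similar `x` and should land much closer to the true effect.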