2023-Big-Data-Driven-Artificial-Intelligence

This repository contains the source code and materials for the Big Data Driven Artificial Intelligence course at BNU, Spring 2023.

This course comprehensively introduces the latest developments in Big Data Driven Artificial Intelligence, including but not limited to neural networks, deep learning, reinforcement learning, causal inference, generative models, language models, and AI for scientific discovery.

Outline

Lecture-01: Introduction to Big Data and Artificial Intelligence;

  • Providing an overview of the history and different schools of Artificial Intelligence.
  • Covering the latest advancements in big data driven AI technologies.
  • Illustrating real-world applications such as ChatGPT and protein folding prediction.
  • References
     - Machine intelligence, Nature 521, 435 (28 May 2015).  |  Paper  |
     - Prediction and its limits, Science, Vol 355, Issue 6324, pp. 468-469, 3 Feb 2017.  |  Paper  |
     - AI transforms science, Science, Vol 357, Issue 6346, 7 Jul 2017.  |  Paper  |
     - The Emperor's New Mind (《皇帝的新脑》), Roger Penrose;
     - On Intelligence (Chinese edition:《人工智能的未来》), Jeff Hawkins;
     - The Book of Why: The New Science of Cause and Effect (《为什么:关于因果关系的新科学》), Judea Pearl / Dana Mackenzie;

Lecture-02: Automatic Differentiation and PyTorch Programming;

  • Introducing automatic differentiation technique and its application scenarios.
  • Introducing the PyTorch automatic differentiation programming platform.
  • Providing an example of using PyTorch (a minimal autograd sketch follows the references below).
  • References
     - Automatic Differentiation in Machine Learning: a Survey.  |  Paper |  Code  |
     - Gumbel-softmax-based Optimization: A Simple General Framework for Optimization Problems on Graphs.  |  Paper  |  Code  |
     - Categorical Reparameterization with Gumbel-Softmax.  |  Paper  |  Code  |
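
To accompany the PyTorch example mentioned above, here is a minimal sketch of reverse-mode automatic differentiation with torch.autograd; the function and tensor shapes are arbitrary illustrative choices, not taken from the course code.

```python
import torch

# A scalar function f(x, w) = sum((x @ w)^2); PyTorch records the
# computation graph as the expression is evaluated.
x = torch.randn(4, 3)                      # fixed input data
w = torch.randn(3, 2, requires_grad=True)  # parameter we differentiate with respect to

y = (x @ w).pow(2).sum()                   # forward pass builds the graph
y.backward()                               # reverse-mode AD: one backward sweep

print(w.grad.shape)                                       # torch.Size([3, 2]): dy/dw
print(torch.autograd.grad(((x @ w) ** 2).sum(), w)[0])    # equivalent functional API
```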

Lecture-03: Fundamentals of Machine Learning;

  • What is machine learning, and how is it commonly categorized?
  • What are the basic steps of machine learning?
  • Performance evaluation and common issues in machine learning.
  • Introduction to simple feedforward neural networks and the backpropagation algorithm (a minimal training-loop sketch follows the references below).
  • References
     - A high-bias, low-variance introduction to Machine Learning for physicists.  |  Paper |  Code1  |  Code2  |
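
As a companion to the backpropagation bullet above, a minimal sketch of training a small feedforward network with gradient descent on a synthetic regression task; the architecture, learning rate, and data are illustrative assumptions, not the course's own example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data: y = sin(x) + noise
x = torch.linspace(-3, 3, 200).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

# A small feedforward (multi-layer perceptron) network
model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

for step in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()        # backpropagation: gradients of the loss w.r.t. all weights
    optimizer.step()       # gradient-descent update

print(f"final training loss: {loss.item():.4f}")
```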

Lecture-04: Common Neural Network Architectures;

  • Basic and common neural network architectures and programming practices, such as feedforward neural networks, convolutional neural networks, and recurrent neural networks (a CNN sketch follows this list).
  • Classification problems and practices in image processing and natural language processing.
  • Fundamental methods of data processing.
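
A minimal sketch of a convolutional network for 28x28 grayscale images, as referenced in the first bullet; the layer sizes and the 10-class output are illustrative assumptions (an MNIST-like setting), not the course's own model.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Conv -> ReLU -> Pool twice, then a linear classifier."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x)
        return self.classifier(h.flatten(1))

# Shape check on a dummy batch of 8 grayscale 28x28 images
logits = SmallCNN()(torch.randn(8, 1, 28, 28))
print(logits.shape)  # torch.Size([8, 10])
```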

Lecture-05: Theory of Representation Learning;

  • Representation learning theory.
  • Representation learning and transfer learning.
  • Pre-training and transfer learning.
  • Examples of transfer learning in image tasks.
  • Introduction to word embedding techniques and their applications (a toy skip-gram sketch follows the references below).
  • References
     - Learning Word Representations by Jointly Modeling Syntagmatic and Paradigmatic Relations.  |  Paper |  Code  |
     - Efficient Estimation of Word Representations in Vector Space.  |  Paper  |  Code  |
     - Distributed Representations of Words and Phrases and their Compositionality.  |  Paper  |  Code  |
     - The Geometry of Culture: Analyzing Meaning through Word Embeddings.  |  Paper  |  Code  |
     - Semantics derived automatically from language corpora contain human-like biases.  |  Paper  |  Code  |
     - Word embeddings quantify 100 years of gender and ethnic stereotypes.  |  Paper  |  Code  |
     - Combining satellite imagery and machine learning to predict poverty.  |  Paper  |  Code  |  Website  |
     - Fighting poverty with data.  |  Paper  |
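
To illustrate the word-embedding bullet above, a toy skip-gram-style sketch built on nn.Embedding; the corpus, window size, and training details are illustrative and are not the word2vec implementations cited in the references.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

corpus = "the king rules the kingdom the queen rules the kingdom".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# (center, context) pairs with a window of 1
pairs = [(idx[corpus[i]], idx[corpus[j]])
         for i in range(len(corpus))
         for j in (i - 1, i + 1) if 0 <= j < len(corpus)]
centers = torch.tensor([c for c, _ in pairs])
contexts = torch.tensor([c for _, c in pairs])

emb_in = nn.Embedding(len(vocab), 16)   # word vectors we keep
emb_out = nn.Embedding(len(vocab), 16)  # output (context) vectors
opt = torch.optim.Adam(list(emb_in.parameters()) + list(emb_out.parameters()), lr=0.05)

for _ in range(300):
    opt.zero_grad()
    scores = emb_in(centers) @ emb_out.weight.T           # logits over the vocabulary
    loss = nn.functional.cross_entropy(scores, contexts)  # predict context from center
    loss.backward()
    opt.step()

vecs = nn.functional.normalize(emb_in.weight.detach(), dim=1)
print((vecs[idx["king"]] @ vecs[idx["queen"]]).item())    # cosine similarity of two word vectors
```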

Lecture-06: From Deep Neural Networks to Neural ODE;

  • Numerical algorithms for solving ordinary differential equations.
  • Residual networks.
  • Principles of Neural ODE (an Euler-step sketch follows the references below).
  • Application examples.
  • Optimal control and adjoint algorithm.
  • References
     - Theory and Applications of AlexNet Convolutional Neural Network.  |  Link |
     - Very Deep Convolutional Networks for Large-Scale Image Recognition.  |  Paper  |  Code  |
     - FractalNet: Ultra-Deep Neural Networks without Residuals.  |  Paper  |  Code  |
     - Deep Residual Learning for Image Recognition.  |  Paper  |  Code  |
     - Identity Mappings in Deep Residual Networks.  |  Paper  |  Code  |
     - Neural Ordinary Differential Equations.  |  Paper  |  Code  |
     - An empirical study of neural ordinary differential equations.  |  Paper  |  Code1  |  Code2  |  Poster  |
     - Deep Multi-Output Forecasting: Learning to Accurately Predict Blood Glucose Trajectories.  |  Paper  |  Code  |
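
A minimal sketch connecting the residual-network and Neural ODE bullets above: a residual block can be read as one explicit Euler step of dh/dt = f(h), and taking more, smaller steps gives an ODE-style forward pass. The vector field and step counts are illustrative, and the referenced papers use the adjoint method rather than this naive integrator.

```python
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """f_theta(h): the learned right-hand side of dh/dt = f_theta(h)."""
    def __init__(self, dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 32), nn.Tanh(), nn.Linear(32, dim))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.net(h)

def euler_integrate(f: nn.Module, h0: torch.Tensor, t1: float = 1.0, steps: int = 20):
    """Explicit Euler: h <- h + dt * f(h). With steps=1 this is a residual block."""
    h, dt = h0, t1 / steps
    for _ in range(steps):
        h = h + dt * f(h)
    return h

f = VectorField()
h0 = torch.randn(4, 8)
print(euler_integrate(f, h0, steps=1).shape)    # one step: ResNet-style update
print(euler_integrate(f, h0, steps=100).shape)  # many small steps: ODE-style forward pass
```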

Lecture-07: Overview of Generative Models;

  • The difference between generative models and predictive models.
  • Classification of generative models.
  • Introduction to generative models, including GANs, VAEs, normalizing flows, and diffusion models (a toy GAN sketch follows the references below).
  • References I
     - 3D Image Generation with Diffusion Models.  |  Link1  |  Link2  |
     - Generative chemistry: drug discovery with deep learning generative models.  |  Paper  |
     - Human-instructed Deep Hierarchical Generative Learning for Automated Urban Planning.  |  Paper  |  Website  |
     - FractalNet: Ultra-Deep Neural Networks without Residuals.  |  Paper  |  Code  |
     - Generative Models in Deep Learning. In: Synthetic Data for Deep Learning.  |  Paper  |
     - Generative Adversarial Nets.  |  Paper  |  Code  |
     - An overview of gradient descent optimization algorithms.  |  Paper  |  Code  |
     - Conditional Generative Adversarial Nets.  |  Paper  |  Code  |
     - Generative Adversarial Text to Image Synthesis.  |  Paper  |  Code  |
     - Image-to-Image Translation with Conditional Adversarial Networks.  |  Paper  |  Code  |
     - Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks.  |  Paper  |  Code1  |  Code2  |
  • References II
     - What are Diffusion Models?.  |  Link  |
     - From Autoencoder to Beta-VAE.  |  Link  |
     - UC Berkeley -- Spring 2020 -- Deep Unsupervised Learning -- Pieter Abbeel, Peter Chen, Jonathan Ho, Aravind Srinivas, Alex Li, Wilson Yan -- L3 Flows.  |  Link  |
     - Density estimation using Real NVP.  |  Paper  |  Code  |
     - Deep Unsupervised Learning using Nonequilibrium Thermodynamics.  |  Paper  |  Code  |
     - Generative Modeling by Estimating Gradients of the Data Distribution.  |  Paper  |  Code  |  Link  |
     - Denoising Diffusion Probabilistic Models.  |  Paper  |  Code  |
     - U-Net: Convolutional Networks for Biomedical Image Segmentation.  |  Paper  |  Code  |
     - Learning Transferable Visual Models From Natural Language Supervision.  |  Paper  |  Code  |
     - Hierarchical Text-Conditional Image Generation with CLIP Latents.  |  Paper  |  Code  |
     - A Survey on Generative Diffusion Model.  |  Paper  |  Code  |
     - Diffusion Models: A Comprehensive Survey of Methods and Applications.  |  Paper  |  Code  |
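
To make the taxonomy above concrete, a toy GAN sketch in the spirit of the Generative Adversarial Nets reference: a generator maps noise to samples and a discriminator tries to tell them apart from a 1-D Gaussian "dataset". All architectures and hyperparameters here are illustrative choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

G = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))  # noise -> sample
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # sample -> real/fake logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def real_batch(n=64):
    # Target distribution: N(mean=3, std=0.5)
    return 3.0 + 0.5 * torch.randn(n, 1)

for step in range(3000):
    # Discriminator update: real -> 1, fake -> 0
    real, fake = real_batch(), G(torch.randn(64, 4)).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator update: fool the discriminator (fake -> 1)
    fake = G(torch.randn(64, 4))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

samples = G(torch.randn(1000, 4))
print(samples.mean().item(), samples.std().item())  # should move toward ~3.0 and ~0.5
```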

Lecture-08: From Transformer to ChatGPT;

  • Attention mechanism.
  • Self-attention mechanism and network structure learning.
  • Introduction to the Transformer architecture (a scaled dot-product attention sketch follows the references below).
  • Applications of Transformer.
  • Self-supervised learning mechanism based on language models.
  • Introduction to architectures such as BERT, GPT-3, and ChatGPT.
  • References
     - Sparks of Artificial General Intelligence: Early experiments with GPT-4.  |  Paper  |
     - Learning Word Representations by Jointly Modeling Syntagmatic and Paradigmatic Relations.  |  Paper  |  Code  |
     - Convolutional Sequence to Sequence Learning.  |  Paper  |  Code  |
     - Attention Is All You Need.  |  Paper  |  Code  |
     - Neural Phrase-based Machine Translation.  |  Paper  |  Code  |
     - Starting from word2vec: a tour of GPT's extended family tree.  |  Link  |
     - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.  |  Paper  |  Code  |
     - Improving Language Understanding by Generative Pre-Training.  |  Paper  |  Code  |
     - Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks.  |  Paper  |  Code  |
     - Reinforcement learning from human feedback.  |  Link  |
     - Scaling Laws for Neural Language Models.  |  Paper  |
     - 137 emergent abilities of large language models.  |  Link  |
     - A Survey on In-context Learning.  |  Paper  |  Code  |
     - Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers.  |  Paper  |  Code  |
     - Prompt Engineering: patiently guiding the model.  |  Link  |
     - Unsolved mysteries on the road to ChatGPT and AGI.  |  Link  |
     - An introduction to ChatGPT alternatives.  |  Link  |
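
A minimal sketch of the scaled dot-product attention at the core of the Attention Is All You Need reference, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, shown for a single head with no masking or dropout; the tensor shapes are illustrative.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, len_q, len_k)
    weights = torch.softmax(scores, dim=-1)            # attention weights sum to 1 over keys
    return weights @ v, weights

# Toy self-attention: queries, keys, and values all come from the same sequence
x = torch.randn(2, 5, 16)                  # (batch=2, sequence length=5, d_model=16)
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape, attn.shape)               # torch.Size([2, 5, 16]) torch.Size([2, 5, 5])
```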

Lecture-09: Graph Neural Networks;

  • Graph and Network.
  • Basic principles of Graph Neural Networks (a minimal GCN-layer sketch follows the references below).
  • Basic applications of Graph Neural Networks.
  • Node classification.
  • Data-driven modeling of complex systems based on Graph Neural Networks.
  • References I
     - A Comprehensive Survey on Graph Neural Networks.  |  Paper  |  Code  |
     - Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges.  |  Paper  |  Code  |
     - DeepWalk: Online Learning of Social Representations.  |  Paper  |  Code  |
     - node2vec: Scalable Feature Learning for Networks.  |  Paper  |  Code  |
     - Complex Network Classification with Convolutional Neural Network.  |  Paper  |  Code  |
     - struc2vec: Learning Node Representations from Structural Identity.  |  Paper  |  Code  |
     - From Node Embedding To Community Embedding: A Hyperbolic Approach.  |  Paper  |  Code  |
     - Knowledge graph embedding by translating on hyperplanes.  |  Paper  |  Code  |
     - Deep Sets.  |  Paper  |  Code  |
     - Graph Convolutional Networks.  |  Link  |
     - Graph Attention Networks.  |  Paper  |  Code  |
     - The Emerging Field of Signal Processing on Graphs: Extending High-Dimensional Data Analysis to Networks and Other Irregular Domains.  |  Paper  |  Code  |
     - Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering.  |  Paper  |  Code  |
  • References II
     - Variational Graph Auto-Encoders.  |  Paper  |  Code  |
     - A detailed walkthrough of the VGAE (Variational Graph Auto-Encoders) paper.  |  Link  |
     - Learning Universal Network Representation via Link Prediction by Graph Convolutional Neural Network.  |  Paper  |  Code  |
     - Discovering latent node Information by graph attention network.  |  Paper  |  Code  |
     - Graph Attention Networks.  |  Paper  |  Code  |
     - Network Completion: Beyond Matrix Completion.  |  Paper  |
     - Kronecker Graphs: An Approach to Modeling Networks.  |  Paper  |
     - The Network Completion Problem: Inferring Missing Nodes and Edges in Networks.  |  Paper  |
     - Completing Networks by Learning Local Connection Patterns.  |  Paper  |  Code  |
     - How Powerful are Graph Neural Networks?.  |  Paper  |  Code  |
     - Generative Graph Convolutional Network for Growing Graphs.  |  Paper  |  Code  |
     - DeepNC: Deep Generative Network Completion.  |  Paper  |  Code  |
     - A Systematic Survey on Deep Generative Models for Graph Generation.  |  Paper  |  Code  |
     - GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models.  |  Paper  |  Code  |
     - A Survey on Deep Graph Generation: Methods and Applications.  |  Paper  |  Code  |
     - Graph Normalizing Flows.  |  Paper  |  Code  |
     - UC Berkeley -- Spring 2020 -- Deep Unsupervised Learning -- Pieter Abbeel, Peter Chen, Jonathan Ho, Aravind Srinivas, Alex Li, Wilson Yan -- L3 Flows.  |  Link  |
     - Density estimation using Real NVP.  |  Paper  |  Code  |
     - Generative Diffusion Models on Graphs: Methods and Applications.  |  Paper  |  Code  |
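
A minimal sketch of the graph-convolution propagation rule popularized by GCNs (see the Graph Convolutional Networks link above), H' = relu(D^{-1/2} (A + I) D^{-1/2} H W), written in dense form for a toy 4-node graph; the graph and feature sizes are illustrative.

```python
import torch
import torch.nn as nn

class DenseGCNLayer(nn.Module):
    """One graph convolution: H' = relu(norm_adj @ H @ W)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, adj: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        a_hat = adj + torch.eye(adj.size(0))                           # add self-loops
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        norm_adj = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]   # D^-1/2 (A+I) D^-1/2
        return torch.relu(norm_adj @ self.linear(h))

# Toy undirected graph on 4 nodes: edges (0-1), (1-2), (2-3)
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [0., 0., 1., 0.]])
h = torch.randn(4, 5)                     # 5 input features per node
print(DenseGCNLayer(5, 3)(adj, h).shape)  # torch.Size([4, 3]): new node embeddings
```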

Lecture-10: Data-Driven Modeling of Complex Systems;

  • Introduction to complex systems.
  • Modeling methods for complex systems.
  • Data-driven modeling methods for complex systems (a one-step dynamics-learning sketch follows the references below).
  • A complete closed-loop system including decision-making and feedback.
  • Learning causal relationships.
  • Reinforcement learning framework based on world models.
  • References
     - Investigating time, strength, and duration of measures in controlling the spread of COVID-19 using a networked meta-population model.  |  Paper  |
     - Takens's theorem.  |  Link  |
     - Deep Multi-Output Forecasting: Learning to Accurately Predict Blood Glucose Trajectories.  |  Paper  |  Code  |
     - Reservoir computing.  |  Link  |
     - Model-Free Prediction of Large Spatiotemporally Chaotic Systems from Data: A Reservoir Computing Approach.  |  Paper  |
     - PM2.5-GNN: A Domain Knowledge Enhanced Graph Neural Network For PM2.5 Forecasting.  |  Paper  |  Code  |
     - Universal framework for reconstructing complex networks and node dynamics from discrete or continuous dynamics data.  |  Paper  |  Code  |
     - A General Deep Learning Framework for Network Reconstruction and Dynamics Learning.  |  Paper  |  Code  |
     - Categorical Reparameterization with Gumbel-Softmax.  |  Paper  |  Code  |
     - Model-free inference of direct network interactions from nonlinear collective dynamics.  |  Paper  |  Code  |
     - Neural Relational Inference for Interacting Systems.  |  Paper  |  Code  |
     - Discovering latent node Information by graph attention network.  |  Paper  |  Code  |
     - HighAir: A Hierarchical Graph Neural Network-Based Air Quality Forecasting Method.  |  Paper  |  Code  |
     - Multi-Scale Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition.  |  Paper  |  Code  |
     - Graph U-Nets.  |  Paper  |  Code  |
     - U-Net: Convolutional Networks for Biomedical Image Segmentation.  |  Paper  |  Code  |
     - Deep Learning for Prediction of the Air Quality Response to Emission Changes.  |  Paper  |
     - Estimates and 25-year trends of the global burden of disease attributable to ambient air pollution: an analysis of data from the Global Burden of Diseases Study 2015.  |  Paper  |
     - Learning to Simulate Complex Physics with Graph Networks.  |  Paper  |  Code  |
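
A minimal sketch of the data-driven modeling bullet above: fit a neural network one-step map x_{t+1} = f(x_t) to a simulated trajectory and roll it forward. The damped-pendulum system, step size, and network are illustrative choices, not the frameworks described in the references.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def pendulum_step(state, dt=0.05):
    """Ground-truth damped pendulum, state = (angle, angular velocity)."""
    theta, omega = state[..., 0], state[..., 1]
    return torch.stack([theta + dt * omega,
                        omega + dt * (-torch.sin(theta) - 0.1 * omega)], dim=-1)

# Simulate one trajectory as the "observed data"
states = [torch.tensor([1.0, 0.0])]
for _ in range(500):
    states.append(pendulum_step(states[-1]))
traj = torch.stack(states)                  # (501, 2)
x_t, x_next = traj[:-1], traj[1:]

# Learn the one-step map x_{t+1} = f_theta(x_t)
model = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x_t), x_next)
    loss.backward()
    opt.step()

# Roll the learned model forward from the initial condition
with torch.no_grad():
    pred = traj[0]
    for _ in range(500):
        pred = model(pred)
print("true endpoint:", traj[-1], "model rollout:", pred)
```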

Lecture-11: Causal Machine Learning;

  • Causation and correlation (a confounded-SCM sketch follows the references below).
  • Introduction to Causal Inference.
  • Introduction to Causal Discovery.
  • Causal Representation Learning.
  • References I
     - Judea Pearl: The Book of Why, 2019.  |  Link  |
     - Towards Causal Representation Learning.  |  Paper  |
     - Judea Pearl: Causality, 2009  |  Link  |
     - A summary of structural causal models and an analysis of their relationship to the potential-outcomes framework.  |  Link  |
  • References II
     - Causality and model abstraction.  |  Paper  |
     - Causal Machine Learning: A Survey and Open Problems.  |  Paper  |
     - An informal survey of causal representation learning.  |  Link  |
     - Elements of Causal Inference - Foundations and Learning Algorithms.  |  Link  |
     - An introduction to causal discovery and a tutorial on the Tetrad toolkit.  |  Link  |
     - Introduction to Causal Inference.  |  Link  |
     - Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm.  |  Paper  |
     - Review of Causal Discovery Methods Based on Graphical Models.  |  Paper  |
     - Universal framework for reconstructing complex networks and node dynamics from discrete or continuous dynamics data.  |  Paper  |  Code  |
     - Causal feature learning: an overview.  |  Paper  |
     - A Tutorial on Independent Component Analysis.  |  Paper  |  Code  |
     - Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA.  |  Paper  |  Code  |
     - Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA.  |  Paper  |
     - Variational Autoencoders and Nonlinear ICA: A Unifying Framework.  |  Paper  |  Code  |
     - Stable Learning.  |  Link  |
     - A theory of independent mechanisms for extrapolation in generative models.  |  Paper  |
     - Types and Forms of Emergence.  |  Paper  |  Link  |
     - Quantifying causal emergence shows that macro can beat micro.  |  Paper  |
     - When the map is better than the territory.  |  Paper  |
     - Neural Information Squeezer for Causal Emergence.  |  Paper  |  Code  |
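
A minimal sketch of the causation-versus-correlation point above: in a simulated linear structural causal model with a confounder, the observational regression of Y on X is biased, while adjusting for the confounder (back-door adjustment) recovers the true effect. The SCM and its coefficients are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Structural causal model: Z -> X, Z -> Y, X -> Y (true causal effect of X on Y is 1.0)
z = rng.normal(size=n)                    # confounder
x = 2.0 * z + rng.normal(size=n)
y = 1.0 * x + 3.0 * z + rng.normal(size=n)

# Naive observational estimate: regress Y on X only (biased by the confounder Z)
naive = np.polyfit(x, y, 1)[0]

# Adjusted estimate: regress Y on X and Z jointly (back-door adjustment)
design = np.column_stack([x, z, np.ones(n)])
adjusted = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(f"naive slope:    {naive:.3f}   (biased away from 1.0)")
print(f"adjusted slope: {adjusted:.3f}   (close to the true effect 1.0)")
```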

Lecture-12: Reinforcement Learning;

  • Basic framework of reinforcement learning.
  • Classification of reinforcement learning.
  • The Q-learning algorithm (a tabular Q-learning sketch follows the references below).
  • Deep reinforcement learning.
  • Reinforcement learning algorithms based on the World Model.
  • Causality and reinforcement learning.
  • Reinforcement learning and control/decision-making.
  • References
     - Integrated architecture for learning, planning, and reacting based on approximating dynamic programming.  |  Paper  |
     - Kinds of RL Algorithms.  |  Link  |
     - Mastering the game of Go with deep neural networks and tree search.  |  Paper  |
     - What is dynamic programming, and why does it matter?  |  Link  |
     - Human-level control through deep reinforcement learning.  |  Paper  |  Code  |
     - Causal reinforcement learning | Causal Science and Causal AI reading group.  |  Link  |
     - Off-Policy Evaluation in Partially Observable Environments.  |  Paper  |
     - Causal Confusion in Imitation Learning.  |  Paper  |  Code  |
     - Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search.  |  Paper  |
     - Causal Discovery with Reinforcement Learning.  |  Paper  |  Code  |
     - Causal Machine Learning: A Survey and Open Problems.  |  Paper  |
     - Towards Causal Representation Learning.  |  Paper  |
     - The Book of Why.  |  Link  |
     - World Models.  |  Paper  |  Code  |
     - Dream to Control: Learning Behaviors by Latent Imagination.  |  Paper  |  Code  |
     - Mastering Diverse Domains through World Models.  |  Paper  |  Code  |
     - Separating the World and Ego Models for Self-Driving.  |  Paper  |  Code  |
     - Thinking, Fast and Slow (《思考,快与慢》).  |  Link  |
     - Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements.  |  Paper  |

Sources