Geometric Deep Learning for Protein Structure Data with PyTorch Lightning

To open the notebooks in Google Colab, go to the GitHub search page at https://colab.research.google.com/github/ and enter the repository: https://github.com/PickyBinders/geometric-learning-protein-structures-course

Objectives

Develop a codebase for exploring, training, and evaluating graph deep learning models that take protein structures as input for a residue-level prediction task.

  • Learn how to featurize protein structures as graphs using Graphein
  • Understand the data loading and processing pipeline for graph datasets using PyTorch Geometric
  • Learn how to implement graph neural networks using PyTorch Geometric
  • Understand the typical deep learning training and evaluation loops using PyTorch Lightning

Task and Dataset

  • Given an input protein chain, predict for each residue whether or not it belongs to a protein-protein interface.
  • The dataset (in dataset.txt) is a subset of the MaSIF-site dataset.
  • Each line of the file gives a PDB ID and a chain. For each chain, we'll extract the residues at the interface with other chains and label them as positive examples. All other residues are negative examples.
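
The labeling scheme described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the course's actual pipeline: the entry format `PDBID_CHAIN`, the helper names `parse_entry` and `label_interface`, the use of one coordinate per residue, and the 6.0 Å distance cutoff are all hypothetical choices made for the example. The source does not specify how interface residues are defined.

```python
import math

def parse_entry(line, sep="_"):
    """Split one dataset line into (pdb_id, chain).

    Assumes entries look like '1ABC_A'; the real separator may differ.
    """
    pdb_id, chain = line.strip().split(sep, 1)
    return pdb_id, chain

def label_interface(target_coords, partner_coords, cutoff=6.0):
    """Label each target residue 1 if it lies within `cutoff` angstroms
    of any residue from another chain, else 0.

    Coordinates are (x, y, z) tuples, one per residue (for example the
    C-alpha position). The cutoff value is an assumption for this sketch.
    """
    labels = []
    for t in target_coords:
        near = any(math.dist(t, p) <= cutoff for p in partner_coords)
        labels.append(int(near))
    return labels

# Toy example: three residues in the chain of interest, two in a partner chain.
entry = parse_entry("1ABC_A\n")
target = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
partner = [(0.0, 4.0, 0.0), (12.0, 3.0, 0.0)]
labels = label_interface(target, partner)
print(entry, labels)
```

In this toy case the first two residues fall within the cutoff of the partner chain and get label 1, while the third is far from both partner residues and gets label 0. With real structures, the coordinates would come from the parsed PDB file rather than hand-written tuples.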