open-biochem

Open resources for biology, chemistry, and computation.

How this repository is organized

├── assets/     # Images and other media.
├── book/       # Draft of a comprehensive, open source textbook on biology, chemistry, and computation.
├── flashcards/ # Structured YAML files for things worth memorizing.
├── ideas.md    # Ideas for future projects.
├── notes/      # Notes to be converted into book sections, or blog posts.
├── papers/     # Papers to read.
├── README.md   # This document.
├── reference/  # Reference sheets and other structured data.
├── todo.md     # Websites, papers, and other resources to explore.
├── tools.md    # A list of tools used in research and education.
└── usnco/      # Resources related to the United States National Chemistry Olympiad.

Book

📝 Incomplete drafts

  • Protein folding
  • Genome browsers
  • Data sources
  • Immune system, Human leukocyte antigen (Wikipedia)

📜 Papers

🌺 Biology

🧬 Sequence analysis

  • Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. (Rives et al., 2019) (biorxiv)
  • Computational methods for single-cell RNA sequencing. (Hie et al., Annual Reviews, 2020) (http, ipfs)

💻 Computational biology

  • Deep generative model embedding of single-cell RNA-Seq profiles on hyperspheres and hyperbolic spaces. (Ding and Regev, 2019) (biorxiv, ipfs, doi)

🧠 Knowledge bases and datasets

  • DrugBank: a comprehensive resource for in silico drug discovery and exploration. (Wishart et al., Nucleic Acids Res, 2006) (http)

Cell biology

  • Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. (Macosko et al., Cell, 2015) (ipfs, doi)
  • A cancer cell program promotes T-cell exclusion and resistance to checkpoint blockade. (Jerby-Arnon et al., Cell, 2018) (ipfs, doi)
  • Intra- and inter-cellular rewiring of the human colon during ulcerative colitis. (Smillie et al., Cell, 2019) (ipfs, doi)
  • Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. (Tirosh et al., Science, 2016) (ipfs, doi)

Gene editing

  • Programmable base editing of A/T to G/C in genomic DNA without DNA cleavage. (Gaudelli et al., Nature, 2017) (doi, ipfs)
  • Search-and-replace genome editing without double-strand breaks or donor DNA. (Anzalone et al., Nature, 2019) (doi, ipfs)

🧪 Chemistry

🌳 Computational chemistry

  • Discovering chemistry with an ab initio nanoreactor. (Wang, Titov, McGibbon et al., Nature Chem, 2014) (doi, ipfs)
  • Cloud-based simulations on Google Exacycle reveal ligand modulation of GPCR activation pathways. (Kohlhoff, Shukla, Lawrenz et al., Nature Chem, 2014) (ipfs)
  • Exploring chemical space using natural language processing. (Ozturk et al.) (arxiv)

💊 Therapeutic design

  • Small Molecule Targets TMED9 and Promotes Lysosomal Degradation to Reverse Proteinopathy. (Dvela-Levitt et al., Cell, 2019) (ipfs)
    • keywords: Human kidney organoids, MUC1
  • Potential 2019-nCoV 3C-like protease inhibitors designed using generative deep learning approaches. (Zhavoronkov et al., 2020) (ipfs)

Tools

  • Integrative Genomics Viewer (IGV). (docs)
  • IGV Jupyter widgets. (github)
  • Bandage. (docs)
  • NCBI Toolbox. (docs)
  • Vega Visualization Grammar for Graphing. (docs)
  • Binding Database. (website)
  • BioLiP: ligand-protein binding database. (website)

Questions

  • Linked de Bruijn graphs (LdBG)
    • What information do they encode? Why are they useful for bioinformatics?
  • Why is the number of distinct de Bruijn sequences B(k, n) equal to (k!)^(k^(n-1)) / (k^n)?
  • How can we continue to extend Transformer and related NLP architectures to biological sequences? Are there even more pre-training tasks we can use?
  • Topological determinants of protein folding
    • What is the right mathematical model to use to develop theory about protein folding?
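
As a starting point for the de Bruijn counting question above, here is a minimal Python sketch that evaluates the closed form B(k, n) = (k!)^(k^(n-1)) / k^n and checks it against an exhaustive count for small cases. The function names `de_bruijn_count` and `brute_force_count` are illustrative and not part of this repository. Sequences are treated as cyclic and counted up to rotation, which is the convention under which the formula holds. The check does not explain why the formula is true; it only confirms the numbers agree where enumeration is feasible.

```python
from itertools import product
from math import factorial


def de_bruijn_count(k: int, n: int) -> int:
    """Closed-form count of de Bruijn sequences B(k, n), up to rotation."""
    # The division is exact, so integer division loses nothing.
    return factorial(k) ** (k ** (n - 1)) // k ** n


def brute_force_count(k: int, n: int) -> int:
    """Exhaustively count cyclic sequences of length k**n over k symbols
    in which every length-n word appears exactly once."""
    length = k ** n
    canonical_forms = set()
    for seq in product(range(k), repeat=length):
        # Collect every length-n window, wrapping around the end.
        words = {
            tuple(seq[(i + j) % length] for j in range(n))
            for i in range(length)
        }
        # k**n windows and k**n possible words: all distinct means
        # each word occurs exactly once.
        if len(words) == length:
            # Pick the smallest rotation so each cyclic sequence counts once.
            canonical_forms.add(min(seq[i:] + seq[:i] for i in range(length)))
    return len(canonical_forms)


if __name__ == "__main__":
    for k, n in [(2, 2), (2, 3), (2, 4), (3, 2)]:
        print(k, n, de_bruijn_count(k, n), brute_force_count(k, n))
```

The brute-force search visits all k^(k^n) candidate sequences, so it is only practical for very small k and n. For larger cases, counting Eulerian circuits of the de Bruijn graph (the BEST theorem) is the usual route to the formula.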

Books

Chemistry

Computational chemistry

  • Cramer, Essentials of Computational Chemistry: Theories and Models.
  • Koch and Holthausen, A Chemist's Guide to Density Functional Theory.

Biology


📦 Potential projects

  • Protein visualizations and explanations. An educational website that displays many of the commonly encountered proteins in biology, and contains a guided explanation (e.g. waypoints and arrows) of how the protein functions.

🎞️ Videos

  • Broad@15 Talk Series: The March Toward Cancer Precision Medicine. (http)

🍎 Courses

  • MIT 20.380: Biological Engineering Design.
  • MIT 10.637: Quantum Chemical Simulation. (http)
  • Computational genomics course. (youtube)

Blogs and tutorials

Bioinformatics

  • Canadian Bioinformatics Workshops. (http)

Cheminformatics

  • Practical cheminformatics. (http)
  • Molecular modeling basics. (http)

📁 Other Repositories

  • Awesome Bioinformatics. (github)
  • Biotools. (github)
  • Repository of medical data. (github)