
A curated list of awesome resources, papers, datasets, and tools related to Genomic LLMs. This repository aims to provide a comprehensive collection of materials to facilitate research, learning, and development in the field of Genomic LLMs.



PRs welcome. License: MIT

Image Source: To Transformers and Beyond: Large Language Models for the Genome

Table of Contents
  1. Papers
  2. Models
  3. Datasets


Papers

  • HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution

    • Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Callum Birch-Sykes, Michael Wornow, Aman Patel, Clayton Rabideau, Stefano Massaroli, Yoshua Bengio, Stefano Ermon, Stephen A. Baccus, Chris Ré
    • [Paper]
  • To Transformers and Beyond: Large Language Models for the Genome

    • Micaela E. Consens, Cameron Dufault, Michael Wainberg, Duncan Forster, Mehran Karimzadeh, Hani Goodarzi, Fabian J. Theis, Alan Moses, Bo Wang
    • [Paper]
  • GENA-LM: A Family of Open-Source Foundational DNA Language Models for Long Sequences

    • Veniamin Fishman, Yuri Kuratov, Maxim Petrov, Aleksei Shmelev, Denis Shepelin, Nikolay Chekanov, Olga Kardymon, Mikhail Burtsev
    • [Paper]
  • Sequence modeling and design from molecular to genome scale with Evo

    • Eric Nguyen, Michael Poli, Matthew G. Durrant, Armin W. Thomas, Brian Kang, Jeremy Sullivan, Madelena Y. Ng, Ashley Lewis, Aman Patel, Aaron Lou, Stefano Ermon, Stephen A. Baccus, Tina Hernandez-Boussard, Christopher Ré, Patrick D. Hsu, Brian L. Hie
    • [Paper]