Large-Language-Models-in-Chemistry

A working collection of papers, repositories, and models of transformer-based language models trained or tuned for the chemical domain, spanning natural language, chemical modeling, and property prediction.

Transformer-based Large Language Models in Chemistry

The history of philosophy is the history of forgetting. Problems and ideas once examined fall out of sight and out of mind only to resurface later as novel and new. - R. Jacoby

Encoder-Decoder Models (T5, BART, Molecular Transformer, BLM, etc.)

  • Multitask Text and Chemistry T5: Unifying Molecular and Textual Representations via Multi-task Language Modelling. [PAPER] [REPO]
  • MolT5: Translation between Molecules and Natural Language. [PAPER] [REPO] (see the usage sketch after this list)
  • Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction. [PAPER] [REPO]
    • Unassisted Noise-Reduction of Chemical Reactions Data Sets. [PAPER]
    • Automated Extraction of Chemical Synthesis Actions from Experimental Procedures. [PAPER]
    • Predicting retrosynthetic pathways using a combined linguistic model and hyper-graph exploration strategy. [PAPER]
    • Reagent Prediction with a Molecular Transformer Improves Reaction Data Quality. [PAPER] [REPO]
    • Leveraging Infrared Spectroscopy for Automated Structure Elucidation. [PAPER]
  • Uni-Mol: A Universal 3D Molecular Representation Learning Framework. [PAPER] [REPO]
  • T5Chem: Unified Deep Learning Model for Multitask Reaction Predictions with Explanation. [PAPER] [REPO]
  • MolGen: Domain-Agnostic Molecular Generation with Self-feedback. [PAPER] [REPO]
  • TransformMolecules: Can We Quickly Learn to “Translate” Bioactive Molecules with Transformer Models? [PAPER] [REPO]
  • A Pre-trained Conditional Transformer for Target-specific De Novo Molecular Generation. [PAPER]
  • Transformer-CNN: Swiss knife for QSAR modeling and interpretation. [PAPER] [REPO]
  • SMILES Transformer: Pre-trained Molecular Fingerprint for Low Data Drug Discovery. [PAPER] [REPO]
  • Chemformer: a pre-trained transformer for computational chemistry. [PAPER] [REPO]
  • FragNet: Contrastive Learning-Based Transformer Model for Clustering, Interpreting, Visualizing, and Navigating Chemical Space. [PAPER]
  • PanGu Drug Model: Learn a Molecule Like a Human. [PAPER]
  • State-of-the-art augmented NLP transformer models for direct and single-step retrosynthesis. [PAPER] [REPO]
  • Struct2IUPAC: Transformer-based artificial neural networks for the conversion between chemical notations. [PAPER] [REPO]
  • Transformer-based Approach for Predicting Chemical Compound Structures. [PAPER]
  • Crystal Transformer: Self-learning neural language model for Generative and Tinkering Design of Materials. [PAPER]
    • Material Transformer Generator: Discovery of 2D materials using Transformer Network based Generative Design. [PAPER]
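
Many of the encoder-decoder models above ship as standard sequence-to-sequence checkpoints, so generic tooling is enough to try them. A minimal sketch for the MolT5 entry, assuming the SMILES-to-caption checkpoint its authors published on the Hugging Face Hub (verify the model ID against the linked repo):

```python
# Minimal sketch: molecule-to-text translation with a MolT5 checkpoint.
# The model ID below is assumed from the MolT5 release; check the linked
# repo before relying on it.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "laituan245/molt5-base-smiles2caption"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin
inputs = tokenizer(smiles, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The authors also publish a caption-to-SMILES variant; the same code drives translation in the opposite direction by swapping the checkpoint.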

Encoder-Only Models (BERT, XLNet, etc.)

  • Regression Transformer enables concurrent sequence regression and generation for molecular language modelling. [PAPER] [REPO]
  • MOFormer: Self-Supervised Transformer model for Metal-Organic Framework Property Prediction. [PAPER] [REPO]
  • RXNFP: Mapping the Space of Chemical Reactions using Attention-Based Neural Networks. [PAPER] [REPO]
  • KV-PLM: A deep-learning system bridging molecule structure and biomedical text with comprehension comparable to human professionals. [PAPER] [REPO]
  • ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction. [PAPER] [REPO] (see the embedding sketch after this list)
    • ChemBERTa-2: Towards Chemical Foundation Models. [PAPER]
  • MolBERT: Molecular representation learning with language models and domain-relevant auxiliary tasks. [PAPER] [REPO]
  • Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules. [PAPER] [REPO]
  • MoLFormer: Large-Scale Chemical Language Representations Capture Molecular Structure and Properties. [PAPER] [REPO]
  • TransPolymer: a Transformer-based language model for polymer property predictions. [PAPER] [REPO]
  • DeLiCaTe: Chemical transformer compression for accelerating both training and inference of molecular modeling. [PAPER] [REPO]
  • MatSciBERT: A materials domain language model for text mining and information extraction. [PAPER] [REPO]
  • SolvBERT for solvation free energy and solubility prediction: a demonstration of an NLP model for predicting the properties of molecular complexes. [PAPER] [REPO]
  • Transformer Quantum State: A Multi-Purpose Model for Quantum Many-Body Problems. [PAPER] [REPO]
  • Taiga: Molecule generation using transformers and policy gradient reinforcement learning. [PAPER] [REPO]
  • SMILES-BERT: Large Scale Unsupervised Pre-Training for Molecular Property Prediction. [PAPER] [REPO]
  • MM-Deacon: Multilingual Molecular Representation Learning via Contrastive Pre-training. [PAPER]
  • MEMO: A Multiview Contrastive Learning Approach to Molecular Pretraining. [PAPER]
  • Molecule Attention Transformer. [PAPER] [REPO]
    • SolTranNet: A machine learning tool for fast aqueous solubility prediction. [PAPER] [REPO]
  • MaterialBERT for natural language processing of materials science texts. [PAPER]
  • Adaptive Language Model Training for Molecular Design. [PAPER]
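
A common usage pattern for the encoder-only models in this section is to encode a SMILES string and pool the token states into a fixed-size vector for downstream property prediction. A minimal sketch, assuming the public ChemBERTa checkpoint named below (check the linked repo for the currently recommended weights):

```python
# Minimal sketch: pooled molecule embeddings from an encoder-only chemistry LM.
# The checkpoint name is an assumption; substitute whichever weights the
# linked repo recommends.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "seyonec/ChemBERTa-zinc-base-v1"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

smiles_batch = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]
enc = tokenizer(smiles_batch, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**enc).last_hidden_state        # (batch, tokens, dim)

# Mean-pool over real tokens only; the attention mask zeroes out padding.
mask = enc["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(1) / mask.sum(1)  # (batch, dim)
print(embeddings.shape)
```

The resulting vectors can feed any downstream regressor or classifier; fine-tuning the encoder end-to-end is the usual alternative when labeled data permits.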

Decoder-Only Models (GPT, etc.)

  • MolGPT: Molecular Generation Using a Transformer-Decoder Model. [PAPER] [REPO] (see the sampling sketch after this list)
  • ChemGPT: Neural Scaling of Deep Chemical Models. [PAPER] [REPO]
  • OptoGPT: A Foundation Model for Inverse Design in Optical Multilayer Thin Film Structures. [PAPER]
  • SGPT-RL: Optimization of binding affinities in chemical space with generative pretrained transformer and deep reinforcement learning. [PAPER]
  • MolXPT: Wrapping Molecules with Text for Generative Pre-training. [PAPER]
  • Material transformers: deep learning language models for generative materials design. [PAPER] [REPO]
  • XYZTransformer: Language models can generate molecules, materials, and protein binding sites directly in three dimensions as XYZ, CIF, and PDB files. [PAPER]
  • Galactica: A Large Language Model for Science. [PAPER] [REPO]
  • cMolGPT: A Conditional Generative Pre-Trained Transformer for Target-Specific De Novo Molecular Generation. [PAPER] [REPO]
  • PrefixMol: Target- and Chemistry-aware Molecule Design via Prefix Embedding. [PAPER]
  • LigGPT: Molecular Generation using a Transformer-Decoder Model (earlier preprint of MolGPT). [PAPER] [REPO]
  • X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis. [PAPER]
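
Decoder-only molecular models are sampled autoregressively, and the generated strings are typically filtered for chemical validity. A minimal sketch of that loop; the model ID is a hypothetical placeholder for whichever SMILES-tuned causal LM you train or download from the repos above:

```python
# Minimal sketch: autoregressive SMILES sampling from a decoder-only model,
# followed by an RDKit validity check. "my-org/smiles-gpt" is a hypothetical
# placeholder, not a real checkpoint.
from rdkit import Chem
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "my-org/smiles-gpt"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = tokenizer("C", return_tensors="pt")  # seed atom
samples = model.generate(
    **prompt,
    do_sample=True,           # stochastic decoding, not greedy
    top_k=50,
    max_new_tokens=64,
    num_return_sequences=8,
)

for seq in samples:
    smiles = tokenizer.decode(seq, skip_special_tokens=True).replace(" ", "")
    mol = Chem.MolFromSmiles(smiles)  # None if the string is invalid
    print(("valid  " if mol else "invalid"), smiles)
```

The fraction of samples surviving the validity check is itself a common benchmark metric for these generators.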

Other Architectures

  • GROVER: Self-Supervised Graph Transformer on Large-Scale Molecular Data. [PAPER]
  • DMP: Dual-view Molecule Pre-training. [PAPER] [REPO]
  • MICER: a pre-trained encoder–decoder architecture for molecular image captioning. [PAPER] [MODEL]
  • Fragment-based t-SMILES for de novo molecular generation. [PAPER] [REPO]
  • DrugGEN: Target Specific De Novo Design of Drug Candidate Molecules with Graph Transformer-based Generative Adversarial Networks. [PAPER] [REPO]
  • Graphormer: Do Transformers Really Perform Badly for Graph Representation? [PAPER] [REPO] (see the graph-construction sketch after this list)
  • KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction. [PAPER] [REPO]
  • GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning. [PAPER] [REPO]
  • rIOP: Testing the Limits of SMILES-based De Novo Molecular Generation with Curriculum and Deep Reinforcement Learning. [PAPER] [REPO]
  • Discovery of structure-property relations for molecules via hypothesis-driven active learning over the chemical space. [PAPER] [REPO]
  • REINVENT 2.0 – an AI Tool for De Novo Drug Design. [PAPER] [REPO]
    • A Simple Way to Incorporate Target Structural Information in Molecular Generative Models. [PAPER] [REPO]
  • Improving Chemical Autoencoder Latent Space and Molecular De novo Generation Diversity with Heteroencoders. [PAPER]
  • CDDD: Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. [PAPER] [REPO]
  • Unsupervised Representation Learning for Proteochemometric Modeling. [PAPER]
  • UnCorrupt SMILES: a novel approach to de novo design. [PAPER] [REPO]
  • Leveraging molecular structure and bioactivity with chemical language models for de novo drug design. [PAPER] [REPO]
  • STOUT: SMILES to IUPAC names using neural machine translation. [PAPER] [MODEL]
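
Several of the graph-based entries above (GROVER, Graphormer, KPGT) consume molecular graphs rather than token strings, so the shared preprocessing step is converting a SMILES string into node and edge lists. A minimal RDKit sketch; the atom and bond features chosen here are illustrative, not the featurization of any particular paper:

```python
# Minimal sketch: SMILES -> graph (nodes, edges) as consumed by graph
# transformers. Atomic number and bond order are placeholder features;
# each model defines its own, richer featurization.
from rdkit import Chem

def smiles_to_graph(smiles: str):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")

    # One node per atom; feature = atomic number.
    nodes = [atom.GetAtomicNum() for atom in mol.GetAtoms()]

    # One undirected edge per bond; feature = bond order (1.5 = aromatic).
    edges = [
        (b.GetBeginAtomIdx(), b.GetEndAtomIdx(), b.GetBondTypeAsDouble())
        for b in mol.GetBonds()
    ]
    return nodes, edges

nodes, edges = smiles_to_graph("c1ccccc1O")  # phenol
print(nodes)  # [6, 6, 6, 6, 6, 6, 8]
print(edges)  # ring bonds carry order 1.5
```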

General Works of Interest

  • A Systematic Survey of Chemical Pre-trained Models. [PAPER]
  • Machine intelligence for chemical reaction space. [PAPER]
  • Exploring Chemical Space using Natural Language Processing Methodologies for Drug Discovery. [PAPER]
  • Comparative Study of Deep Generative Models on Chemical Space Coverage. [PAPER]
  • Explainability Techniques for Chemical Language Models. [PAPER]
  • Unified 2D and 3D Pre-Training of Molecular Representations. [PAPER]
  • Exploring chemical space — Generative models and their evaluation. [PAPER]
  • Difficulty in learning chirality for Transformer fed with SMILES. [PAPER] [REPO]
  • Molecular language models: RNNs or transformer? [PAPER]
  • Artificial intelligence in multi-objective drug design. [PAPER]
  • Evaluating the roughness of structure-property relationships using pretrained molecular representations. [PAPER]
  • Reconstruction of lossless molecular representations from fingerprints. [PAPER]
  • The Druglike molecule pretraining strategy for drug discovery. [PAPER]
  • Accelerating the design and development of polymeric materials via deep learning: Current status and future challenges. [PAPER]
  • Materials Transformers Language Models for Generative Materials Design: a benchmark study. [PAPER]
  • A note on transformer architectures.
  • Social Amnesia (History did not start in 2017) [BOOK]
  • Malta – Sweet Magic (1984) [ALBUM]