ai4science

There are 76 repositories under the ai4science topic.

  • opendatalab/MinerU

    A high-quality, one-stop open-source data extraction tool for converting PDF to Markdown and JSON.

    Language: Python
  • microsoft/Graphormer

    Graphormer is a general-purpose deep learning backbone for molecular modeling.

    Language: Python
  • yuzhimanhua/Awesome-Scientific-Language-Models

    A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (EMNLP'24)

  • JuDFTteam/best-of-atomistic-machine-learning

    🏆 A ranked list of awesome atomistic machine learning projects ⚛️🧬💎.

  • PaddlePaddle/PaddleScience

    PaddleScience is an SDK and library for developing AI-driven scientific computing applications based on PaddlePaddle.

    Language: Python
  • shengchaochen82/Awesome-Foundation-Models-for-Weather-and-Climate

    A comprehensive survey of foundation models for weather and climate data understanding.

  • davendw49/k2

    Code and datasets for the paper "K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization" (WSDM 2024)

    Language: Python
  • ChemFoundationModels/ChemLLMBench

    What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

    Language: Jupyter Notebook
  • patrick-tssn/Awesome-Colorful-LLM

    Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics, Fundamental Sciences such as Mathematics, and Ominous.

  • chao1224/Geom3D

    Geom3D: Geometric Modeling on 3D Structures, NeurIPS 2023

    Language: Python
  • AlexDuvalinho/geometric-gnns

    List of Geometric GNNs for 3D atomic systems

  • OSU-NLP-Group/LLM4Chem

    Official code repo for the paper "LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset"

    Language: Python
  • chiang-yuan/llamp

    A web app and Python API for a multi-modal RAG framework that grounds LLMs in high-fidelity materials informatics. An agentic materials scientist powered by @materialsproject, @langchain-ai, and @openai

    Language: Jupyter Notebook
  • deep-symbolic-mathematics/TPSR

    [NeurIPS 2023] This is the official code for the paper "TPSR: Transformer-based Planning for Symbolic Regression"

    Language: Python
  • hanjq17/GMN

    [ICLR 2022] The implementation for the paper "Equivariant Graph Mechanics Networks with Constraints".

    Language: Python
  • zjunlp/NLP4SciencePapers

    Must-read papers on NLP for science.

  • PKU-YuanGroup/TaxDiff

    The official code for "TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation"

    Language: Python
  • deep-symbolic-mathematics/Multimodal-Math-Pretraining

    [ICLR 2024 Spotlight] This is the official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training"

    Language: Python
  • deep-symbolic-mathematics/LLM-SR

    This is the official repo for the paper "LLM-SR" on Scientific Equation Discovery and Symbolic Regression with Large Language Models

    Language: Python
  • HongxinXiang/ImageMol

    ImageMol is a molecular image-based pre-training deep learning framework for computational drug discovery.

    Language: Python
  • zjunlp/ChatCell

    ChatCell: Facilitating Single-Cell Analysis with Natural Language

    Language: Python
  • OPTML-Group/DeepZero

    [ICLR'24] "DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training" by Aochuan Chen*, Yimeng Zhang*, Jinghan Jia, James Diffenderfer, Jiancheng Liu, Konstantinos Parasyris, Yihua Zhang, Zheng Zhang, Bhavya Kailkhura, Sijia Liu

    Language: Python
  • chao1224/ProteinDT

    A Text-guided Protein Design Framework, Nat Mach Intell 2025

    Language: Python
  • Emory-Melody/awesome-epidemic-modeling-papers

    [KDD 2024] Papers about deep learning in epidemic modeling.

  • HongxinXiang/awesome-ai-bioinformatics

    A curated list of awesome AI and bioinformatics resources.

  • pengxingang/MolDiff

    MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation

    Language: Python
  • garywei944/ChemFlow

    Uncover meaningful structures of latent spaces learned by generative models with flows!

    Language: Python
  • haozhg/odmd

    AI4Science: Python/Matlab implementation of online and window dynamic mode decomposition (Online DMD and Window DMD)

    Language: Python
  • jiaor17/3D-EMGP

    [AAAI 2023] The implementation for the paper "Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs"

    Language: Python
  • EureKaZhu/DiffAffinity

    Predicting mutational effects on protein-protein binding via a side-chain diffusion probabilistic model (NeurIPS 2023 Poster)

    Language: Jupyter Notebook
  • HowardLi1984/ECDFormer

    The official code for "Deep peak property learning for efficient chiral molecules ECD spectra prediction"

    Language: Python
  • JieZheng-ShanghaiTech/KG4SL

    Synthetic lethality (SL) is a promising gold mine for the discovery of anti-cancer drug targets. KG4SL is the first graph neural network (GNN)-based model that uses a knowledge graph for SL prediction.

    Language: Python
  • haozhg/oml

    AI4Science: Efficient data-driven Online Model Learning (OML) / system identification and control

    Language: Python
  • hanjq17/EGHN

    [NeurIPS 2022] The implementation for the paper "Equivariant Graph Hierarchy-Based Neural Networks".

    Language: Python
  • terencetaothucb/TBSI-Sunwoda-Battery-Dataset

    Sunwoda Electronic Co., Ltd. and Tsinghua Berkeley Shenzhen Institute (TBSI) produced the TBSI Sunwoda Battery Dataset. We open-source this dataset to inspire more data-driven novel material verification, battery management research, and applications.

    Language: MATLAB
  • dzjxzyd/UniDL4BioPep

    webserver

    Language: Jupyter Notebook