Pinned Repositories
3DInfomax
Making self-supervised learning work on molecules by using their 3D geometry to pre-train GNNs. Implemented in DGL and PyTorch Geometric.
4CAC
adapter-transformers
Huggingface Transformers + Adapters = ❤️
af2complex
Predicting direct protein-protein interactions with AlphaFold deep learning neural network models.
alphafill
AlphaFill is an algorithm based on sequence and structure similarity that “transplants” missing compounds to the AlphaFold models. By adding the molecular context to the protein structures, the models can be more easily appreciated in terms of function and structure integrity.
AlphaZeroFromScratch
Ankh
Ankh: Optimized Protein Language Model
Assembly-Dereplicator
A tool for removing redundant genomes from a set of assemblies
awemags
aweMAGs: a fully automated workflow for eukaryotic MAGs
awesome-ai-bioinformatics
A curated list of awesome AI and Bioinformatics.
darrengao628's Repositories
darrengao628/AlphaZeroFromScratch
darrengao628/Ankh
Ankh: Optimized Protein Language Model
darrengao628/Assembly-Dereplicator
A tool for removing redundant genomes from a set of assemblies
darrengao628/bgcflow-0.7
Snakemake workflow to systematically analyze BGCs and pangenomes of a large number of genomes
darrengao628/bidirectional_streaming_ai_voice
Python scripts to handle a two-way voice conversation with Anthropic Claude, using ElevenLabs, Faster-Whisper, and Pygame.
darrengao628/CarsiDock
darrengao628/chroma
A generative model for programmable protein design
darrengao628/deept2
DeepT2 uses deep learning to identify type II polyketide (T2PK) synthase KSβ proteins and their corresponding T2PK products within bacterial genomes. The method uses ESM2 to transform KSβ sequences into embeddings, which are then used to train two separate multi-layer perceptron classifiers, one for KSβ classification and one for T2PK classification.
darrengao628/DynamicBind
Repository for DynamicBind: Predicting ligand-specific protein-ligand complex structure with a deep equivariant generative model
darrengao628/efficient-evolution
Efficient evolution from protein language models
darrengao628/EndophyteGenomes
Scripts for Hill et al. (2023) doi:10.1093/gbe/evad038 🌿
darrengao628/esm-s
Structure-Informed Protein Language Model
darrengao628/Eukfinder
Eukfinder: A pipeline to retrieve microbial eukaryote genomes from metagenomic sequencing data
darrengao628/evodiff
Generation of protein sequences and evolutionary alignments via discrete diffusion models
darrengao628/EvoPlay
darrengao628/funbgcex
Fungal Biosynthetic Gene Cluster eXtractor
darrengao628/fungiflow
Reproducible Python workflow for identifying biosynthetic gene clusters from fungal sequence data.
darrengao628/FusariumLifestyles
Scripts for Hill et al. (2022) doi:10.1093/molbev/msac085
darrengao628/gpt-engineer
Specify what you want it to build, the AI asks for clarification, and then builds it.
darrengao628/hmmer
HMMER: biological sequence analysis using profile HMMs
darrengao628/Meeko
Interfacing RDKit and AutoDock
darrengao628/MosAIC_BGC
Analysis of biosynthetic gene cluster diversity in mosquito-associated genomes
darrengao628/MultiPPIMI
A deep learning framework for predicting interactions between protein-protein interaction targets and modulators
darrengao628/papers-for-molecular-design-using-DL
List of papers on molecular design using generative AI and deep learning
darrengao628/papers_for_protein_design_using_DL
List of papers about protein design using deep learning
darrengao628/pepmlm
Target Sequence-Conditioned Generation of Peptide Binders via Masked Language Modeling
darrengao628/protpardelle
darrengao628/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
darrengao628/UniDL4BioPep
Benchmark datasets and original code
darrengao628/wlabkit
🧬 This is a toolkit to handle bio-data from WeiBin Lab