kduxin's Stars
labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
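To make the "RNN at inference time" claim concrete, below is a heavily simplified sketch of a per-channel WKV recurrence in the style of RWKV-4 (decay w, first-token bonus u). It is an illustrative reading of the published formula, not the repo's code, and omits the numerical stabilization a real implementation needs.

```python
import numpy as np

def wkv_recurrent(k, v, w, u):
    """Simplified RWKV-style WKV recurrence (illustrative, unstabilized).

    k, v: (T, C) keys and values; w, u: (C,) learned decay and
    first-token bonus. Runs in O(T*C) time with O(C) state -- the
    RNN view used at inference time (training uses a parallel form).
    """
    T, C = k.shape
    a = np.zeros(C)             # running sum of exp(k_i) * v_i
    b = np.zeros(C)             # running sum of exp(k_i)
    out = np.zeros((T, C))
    for t in range(T):
        ek = np.exp(k[t])
        bonus = np.exp(u) * ek  # extra weight on the current token
        out[t] = (a + bonus * v[t]) / (b + bonus)
        a = np.exp(-w) * a + ek * v[t]   # decay old state, absorb new token
        b = np.exp(-w) * b + ek
    return out
```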
NVIDIA/Megatron-LM
Ongoing research on training transformer models at scale
microsoft/DeepSpeedExamples
Example models using DeepSpeed
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
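Since "concise but complete" is easiest to see in code, here is a minimal decoder-only usage sketch; the TransformerWrapper/Decoder constructors follow the project's README, and the hyperparameters are illustrative.

```python
import torch
from x_transformers import TransformerWrapper, Decoder

# Decoder-only (GPT-style) model; all sizes are illustrative.
model = TransformerWrapper(
    num_tokens=20000,                     # vocabulary size
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

tokens = torch.randint(0, 20000, (1, 1024))
logits = model(tokens)                    # (1, 1024, 20000)
```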
Mereithhh/vanblog
A simple, practical, and elegant personal blog system.
microsoft/Megatron-DeepSpeed
Ongoing research on training transformer language models at scale, including BERT & GPT-2
rustwasm/book
The Rust and WebAssembly Book
ml-jku/hopfield-layers
Hopfield Networks is All You Need
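The paper's central identity is that the modern (continuous) Hopfield update ξ ← X softmax(β Xᵀ ξ) is exactly attention over the stored patterns X, and that one update typically retrieves the nearest pattern. A minimal NumPy sketch of that retrieval step, with illustrative dimensions, β, and toy data:

```python
import numpy as np

def hopfield_retrieve(X, xi, beta=8.0, steps=3):
    """Modern Hopfield update: xi <- X @ softmax(beta * X.T @ xi).

    X: (d, N) matrix of N stored patterns; xi: (d,) query state.
    """
    for _ in range(steps):
        scores = beta * (X.T @ xi)        # similarity to each stored pattern
        p = np.exp(scores - scores.max())
        p /= p.sum()                      # softmax retrieval weights
        xi = X @ p                        # convex combination of patterns
    return xi

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 10))                # 10 stored patterns
noisy = X[:, 3] + 0.3 * rng.standard_normal(64)  # corrupted pattern 3
recalled = hopfield_retrieve(X, noisy)
print(int(np.argmax(X.T @ recalled)))            # 3: pattern recovered
```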
google-deepmind/optax
Optax is a gradient processing and optimization library for JAX.
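Optax's API is small enough that a usage sketch captures most of it: initialize optimizer state from the params, transform gradients with update, and apply them. The optimizer calls below follow optax's documented API; the least-squares objective and data are illustrative.

```python
import jax
import jax.numpy as jnp
import optax

# Illustrative least-squares problem (params and data are made up).
params = {"w": jnp.zeros(3)}
xs, ys = jnp.ones((8, 3)), 2.0 * jnp.ones(8)

def loss_fn(p):
    return jnp.mean((xs @ p["w"] - ys) ** 2)

optimizer = optax.adam(learning_rate=1e-1)
opt_state = optimizer.init(params)

for _ in range(200):
    grads = jax.grad(loss_fn)(params)
    updates, opt_state = optimizer.update(grads, opt_state)  # e.g. Adam scaling
    params = optax.apply_updates(params, updates)

print(loss_fn(params))  # approaches 0 as w fits the data
```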
HazyResearch/safari
Convolutions for Sequence Modeling
causaltext/causal-text-papers
Curated research at the intersection of causal inference and natural language processing.
second-state/wasm-learning
Building Rust functions for Node.js to take advantage of Rust's performance, WebAssembly's security and portability, and JavaScript's ease of use. Demo code and recipes.
IntelLabs/academic-budget-bert
Repository containing code for "How to Train BERT with an Academic Budget" paper
mattilyra/LSH
Locality Sensitive Hashing using MinHash in Python/Cython to detect near-duplicate text documents
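Independent of this repo's particular interface (which may differ), the core idea fits in a few lines: a MinHash signature estimates the Jaccard similarity between documents' shingle sets, so near-duplicates agree on most signature positions, and LSH then bands the signatures so likely matches collide in hash buckets. A self-contained sketch of the signature part:

```python
import hashlib

def shingles(text, k=5):
    """Character k-shingles of a document."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def minhash(shingle_set, num_perm=128):
    """One min per salted hash function; approximates a random permutation."""
    sig = []
    for seed in range(num_perm):
        key = seed.to_bytes(8, "big")
        sig.append(min(
            int.from_bytes(hashlib.blake2b(s.encode(), key=key).digest()[:8], "big")
            for s in shingle_set))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing positions is an unbiased Jaccard estimate."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = minhash(shingles("the quick brown fox jumps over the lazy dog"))
b = minhash(shingles("the quick brown fox jumped over the lazy dog"))
print(estimated_jaccard(a, b))  # close to the true Jaccard similarity
```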
claudiashi57/dragonnet
YuanchenBei/Awesome-Pretraining-for-Graph-Neural-Networks
A curated list of papers on pre-training for graph neural networks (Pre-train4GNN).
amaiya/causalnlp
CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable.
lee-ny/teaching_arithmetic
vveitch/causal-text-embeddings-tf2
TensorFlow 2 implementation of Causal-BERT
xxxiaol/GCI
Code for "Everything Has a Cause: Leveraging Causal Inference in Legal Text Analysis" (NAACL 2021 oral paper)
OpenNLPLab/Transnormer
[EMNLP 2022] Official implementation of Transnormer from the paper "The Devil in Linear Transformer"
vveitch/causal-network-embeddings
Software and pre-processed data for "Using Embeddings to Correct for Unobserved Confounding in Networks"
HKUDS/GraphPro
[WWW'2024] "GraphPro: Graph Pre-training and Prompt Learning for Recommendation"
idiap/hypermixing
PyTorch implementation of HyperMixing, the linear-time token-mixing technique used in the HyperMixer architecture
yingyichen-cyy/PrimalAttention
(NeurIPS 2023) PyTorch implementation of "Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation"
ncsulsj/Causal_LLM