kduxin's Stars
labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
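To make the "RNN at inference time" claim concrete, below is a heavily simplified sketch of a per-channel WKV recurrence in the style of RWKV-4 (decay w, first-token bonus u). It is an illustrative reading of the published formula, not the repo's code, and omits the numerical stabilization a real implementation needs.

```python
import numpy as np

def wkv_recurrent(k, v, w, u):
    """Simplified RWKV-style WKV recurrence (illustrative, unstabilized).

    k, v: (T, C) keys and values; w, u: (C,) learned decay and
    first-token bonus. Runs in O(T*C) time with O(C) state -- the
    RNN view used at inference time (training uses a parallel form).
    """
    T, C = k.shape
    a = np.zeros(C)             # running sum of exp(k_i) * v_i
    b = np.zeros(C)             # running sum of exp(k_i)
    out = np.zeros((T, C))
    for t in range(T):
        ek = np.exp(k[t])
        bonus = np.exp(u) * ek  # extra weight on the current token
        out[t] = (a + bonus * v[t]) / (b + bonus)
        a = np.exp(-w) * a + ek * v[t]   # decay old state, absorb new token
        b = np.exp(-w) * b + ek
    return out
```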
NVIDIA/Megatron-LM
Ongoing research on training transformer models at scale
microsoft/DeepSpeedExamples
Example models using DeepSpeed
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
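Since "concise but complete" is easiest to see in code, here is a minimal decoder-only usage sketch; the TransformerWrapper/Decoder constructors follow the project's README, and the hyperparameters are illustrative.

```python
import torch
from x_transformers import TransformerWrapper, Decoder

# Decoder-only (GPT-style) model; all sizes are illustrative.
model = TransformerWrapper(
    num_tokens=20000,                     # vocabulary size
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

tokens = torch.randint(0, 20000, (1, 1024))
logits = model(tokens)                    # (1, 1024, 20000)
```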
Mereithhh/vanblog
A simple, practical, and elegant personal blog system.
microsoft/Megatron-DeepSpeed
Ongoing research on training transformer language models at scale, including BERT & GPT-2
rustwasm/book
The Rust and WebAssembly Book
ml-jku/hopfield-layers
Hopfield Networks is All You Need
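The paper's central identity is that the modern (continuous) Hopfield update ξ ← X softmax(β Xᵀ ξ) is exactly attention over the stored patterns X, and that one update typically retrieves the nearest pattern. A minimal NumPy sketch of that retrieval step, with illustrative dimensions, β, and toy data:

```python
import numpy as np

def hopfield_retrieve(X, xi, beta=8.0, steps=3):
    """Modern Hopfield update: xi <- X @ softmax(beta * X.T @ xi).

    X: (d, N) matrix of N stored patterns; xi: (d,) query state.
    """
    for _ in range(steps):
        scores = beta * (X.T @ xi)        # similarity to each stored pattern
        p = np.exp(scores - scores.max())
        p /= p.sum()                      # softmax retrieval weights
        xi = X @ p                        # convex combination of patterns
    return xi

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 10))                # 10 stored patterns
noisy = X[:, 3] + 0.3 * rng.standard_normal(64)  # corrupted pattern 3
recalled = hopfield_retrieve(X, noisy)
print(int(np.argmax(X.T @ recalled)))            # 3: pattern recovered
```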
google-deepmind/optax
Optax is a gradient processing and optimization library for JAX.
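Optax's API is small enough that a usage sketch captures most of it: initialize optimizer state from the params, transform gradients with update, and apply them. The optimizer calls below follow optax's documented API; the least-squares objective and data are illustrative.

```python
import jax
import jax.numpy as jnp
import optax

# Illustrative least-squares problem (params and data are made up).
params = {"w": jnp.zeros(3)}
xs, ys = jnp.ones((8, 3)), 2.0 * jnp.ones(8)

def loss_fn(p):
    return jnp.mean((xs @ p["w"] - ys) ** 2)

optimizer = optax.adam(learning_rate=1e-1)
opt_state = optimizer.init(params)

for _ in range(200):
    grads = jax.grad(loss_fn)(params)
    updates, opt_state = optimizer.update(grads, opt_state)  # e.g. Adam scaling
    params = optax.apply_updates(params, updates)

print(loss_fn(params))  # approaches 0 as w fits the data
```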
HazyResearch/safari
Convolutions for Sequence Modeling
causaltext/causal-text-papers
Curated research at the intersection of causal inference and natural language processing.
second-state/wasm-learning
Building Rust functions for Node.js to take advantage of Rust's performance, WebAssembly's security and portability, and JavaScript's ease of use. Demo code and recipes.
IntelLabs/academic-budget-bert
Repository containing code for "How to Train BERT with an Academic Budget" paper
mattilyra/LSH
Locality Sensitive Hashing using MinHash in Python/Cython to detect near-duplicate text documents
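Independent of this repo's particular interface (which may differ), the core idea fits in a few lines: a MinHash signature estimates the Jaccard similarity between documents' shingle sets, so near-duplicates agree on most signature positions, and LSH then bands the signatures so likely matches collide in hash buckets. A self-contained sketch of the signature part:

```python
import hashlib

def shingles(text, k=5):
    """Character k-shingles of a document."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def minhash(shingle_set, num_perm=128):
    """One min per salted hash function; approximates a random permutation."""
    sig = []
    for seed in range(num_perm):
        key = seed.to_bytes(8, "big")
        sig.append(min(
            int.from_bytes(hashlib.blake2b(s.encode(), key=key).digest()[:8], "big")
            for s in shingle_set))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing positions is an unbiased Jaccard estimate."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = minhash(shingles("the quick brown fox jumps over the lazy dog"))
b = minhash(shingles("the quick brown fox jumped over the lazy dog"))
print(estimated_jaccard(a, b))  # close to the true Jaccard similarity
```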
claudiashi57/dragonnet
YuanchenBei/Awesome-Pretraining-for-Graph-Neural-Networks
A curated list of papers on pre-training for graph neural networks (Pre-train4GNN).
amaiya/causalnlp
CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable.
lee-ny/teaching_arithmetic
vveitch/causal-text-embeddings-tf2
TensorFlow 2 implementation of Causal-BERT
xxxiaol/GCI
Code for "Everything Has a Cause: Leveraging Causal Inference in Legal Text Analysis" (NAACL 2021 oral paper)
OpenNLPLab/Transnormer
[EMNLP 2022] Official implementation of Transnormer from the paper "The Devil in Linear Transformer"
vveitch/causal-network-embeddings
Software and pre-processed data for "Using Embeddings to Correct for Unobserved Confounding in Networks"
HKUDS/GraphPro
[WWW'2024] "GraphPro: Graph Pre-training and Prompt Learning for Recommendation"
idiap/hypermixing
PyTorch implementation of HyperMixing, the linear-time token-mixing technique used in the HyperMixer architecture
yingyichen-cyy/PrimalAttention
(NeurIPS 2023) PyTorch implementation of "Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation"
ncsulsj/Causal_LLM