junesookang's Stars
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
state-spaces/mamba
Mamba SSM architecture
NVIDIA/cuda-samples
Samples for CUDA developers demonstrating features in the CUDA Toolkit
naganandy/graph-based-deep-learning-literature
Links to conference publications in graph-based deep learning
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
togethercomputer/MoA
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models
snap-stanford/ogb
Benchmark datasets, data loaders, and evaluators for graph machine learning
horseee/Awesome-Efficient-LLM
A curated list of efficient large language model resources
NVIDIA/gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
AmberLJC/LLMSys-PaperList
Large Language Model (LLM) Systems Paper List
forhaoliu/ringattention
Transformers with Arbitrarily Large Context
microsoft/ptgnn
A PyTorch Graph Neural Network Library
feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
microsoft/mscclpp
MSCCL++: A GPU-driven communication stack for scalable AI applications
NVIDIA/gds-nvidia-fs
NVIDIA GPUDirect Storage Driver
InfiniTensor/InfiniTensor
ZaidQureshi/bam
microsoft/DeepGNN
DeepGNN is a framework for training machine learning models on large scale graph data.
microsoft/ark
A GPU-driven system framework for scalable AI applications
hgyhungry/ge-spmm
snu-comparch/InfiniGen
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
IllinoisGraphBenchmark/IGB-Datasets
Largest real-world open-source graph dataset. Work done under the IBM-Illinois Discovery Accelerator Institute and Amazon Research Awards, in collaboration with NVIDIA Research.
Sys-KU/DeepPlan
Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access (ACM EuroSys '23)
K-Wu/pytorch-direct_dgl
PyTorch-Direct code on top of PyTorch-1.8.0nightly (e152ca5) for Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture (accepted by PVLDB)
jeongminpark417/GIDS
unist-ssl/IIDP
unist-ssl/JABAS
"JABAS: Joint Adaptive Batching and Automatic Scaling for DNN Training on Heterogeneous GPUs" (EuroSys '25)
seijimaekawa/empirical-study-of-GNNs
HPMLL/HP-SpMM
Fast SpMM implementation on GPUs for GNN (IPDPS'23)