JinuJeong

Korea University

JinuJeong's Stars

microsoft/BitNet
Official inference framework for 1-bit LLMs
Language: C++, 5.1k stars, 325 forks
VIA-Research/uPIMulator
Language:C9313
abdullahfsm/PCS
Language:Python4
sarchlab/mgpusim
A highly-flexible GPU simulator for AMD GPUs.
Language:Go8418
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language: Python, 5.7k stars, 447 forks
Feh/nocache
minimize caching effects
Language:C55453
project-baize/baize-chatbot
Let ChatGPT teach your own chatbot in hours with a single GPU!
Language: Python, 3.2k stars, 283 forks
flexflow/FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
Language: C++, 1.7k stars, 224 forks
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++, 8.4k stars, 957 forks
triton-lang/triton
Development repository for the Triton language and compiler
Language: C++, 13.1k stars, 1.6k forks
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python, 13.8k stars, 1.3k forks
chrischoy/MakePytorchPlusPlus
How and why you want to make your pytorch CUDA/CPP extension with a Makefile
Language:Makefile17116
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Language: Python, 8.5k stars, 607 forks
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python, 28.8k stars, 4.3k forks
Azrael3000/tmpi
Run a parallel command inside a split tmux window
Language:Shell13538
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
Language: C++, 5.8k stars, 891 forks
Raphael-Hao/Abacus
Language:Python378
Sys-KU/AutoTiering
Exploring the Design Space of Page Management for Multi-Tiered Memory Systems (USENIX ATC '21)
Language:C436
Sys-KU/DeepPlan
Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access (ACM EuroSys '23)
Language:C++538
casys-kaist/HUVM
Language:C226
casys-kaist/CoVA
Official code repository for "CoVA: Exploiting Compressed-Domain Analysis to Accelerate Video Analytics [USENIX ATC 22]"
Language:Rust152
neomorphism/neomo
Neomorphism(neumorphism) Design Framework Open Source
Language:CSS445
neoclide/coc.nvim
Nodejs extension host for vim & neovim, load extensions like VSCode and host language servers.
Language: TypeScript, 24.4k stars, 953 forks
khakiee/comments_collector
Collect naver entertain news comments
Language:Python22
