soryxie

TongJi University, ShangHai

soryxie's Stars

opencv/opencv
Open Source Computer Vision Library
Language: C++ 80k 2.7k 10.9k55.9k
pyecharts/pyecharts
🎨 Python Echarts Plotting Library
Language: Python 15k 380 1.9k2.9k
continue-revolution/sd-webui-segment-anything
Segment Anything for Stable Diffusion WebUI
Language: Python 3.4k 33 165208
desireevl/awesome-quantum-computing
A curated list of awesome quantum computing learning and developing resources.
2.6k 137 4409
nkaz001/hftbacktest
A high-frequency trading and market-making backtesting and trading bot in Python and Rust, which accounts for limit orders, queue positions, and latencies, utilizing full tick data for trades and order books, with real-world crypto market-making examples for Binance Futures
Language: Rust 2.1k 60 101421
pytorch/ao
PyTorch native quantization and sparsity for training and inference
Language: Python 1.7k 42 350197
sustcsonglin/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
Language: Python 1.5k 28 6771
ray-project/kuberay
A toolkit to run Ray applications on Kubernetes
Language: Go 1.4k 24 1.1k432
AIoT-MLSys-Lab/Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
1.1k 27 1188
NousResearch/DisTrO
Distributed Training Over-The-Internet
846 88 231
kvcache-ai/ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Language: Python 828 17 6746
efeslab/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
Language: Cuda 691 8 2629
FloridSleeves/LLMDebugger
LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step
Language: Python 472 6 1546
antgroup/glake
GLake: optimizing GPU memory management and IO transmission.
Language: Python 408 7 2235
microsoft/sarathi-serve
A low-latency & high-throughput serving engine for LLMs
Language: Python 292 7 1935
microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
Language: C 268 5 1120
AlibabaPAI/llumnix
Efficient and easy multi-instance LLM serving
Language: Python 267 10 1017
Hobr/transition-ticket
Transition Ticket
Language: Python 216 4 1038
kiri-art/docker-diffusers-api
Diffusers / Stable Diffusion in docker with a REST API, supporting various models, pipelines & schedulers.
Language: Python 203 7 3094
Just-Prog/Bilibili_show_ticket_auto_order
Language: Python 199 3 3538
gpu-mode/triton-index
Cataloging released Triton kernels.
147 7 07
microsoft/ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
Language: Python 130 5 49
thu-nics/DiTFastAttn
Language: Jupyter Notebook 117 3 108
AlibabaPAI/FLASHNN
Language: Python 79 10 28
fanlai0990/CS598
Systems for GenAI
77 2 15
cchan/tccl
extensible collectives library in triton
Language: Python 76 2 04
Gumpest/SparseVLMs
Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Peking University and UC Berkeley.
Language: Python 66 2 153
siyan-zhao/prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models"
Language: Jupyter Notebook 57 2 12
prathebaselva/FORA
FORA introduces simple yet effective caching mechanism in Diffusion Transformer Architecture for faster inference sampling.
Language: Python 32 1 32
mlsys-io/kv.run
A model serving framework for various research and production scenarios. Seamlessly built upon the PyTorch and HuggingFace ecosystem.
Language: C++ 21 2 02
