Issues
Can the Volta/Tesla architectures be supported?
#242 opened by alexngng - 0
Shared-prefix RoPE issue
#194 opened by lkc1997 - 3
Support torch 2.3
#227 opened by rkooo567 - 1
TypeError: get_cu_file_str() missing 1 required positional argument: 'idtype'
#222 opened by xuzhenqi - 0
[Install] Build error on main branch
#195 opened by esmeetu - 1
[LoRA] Roadmap of LoRA operators
#199 opened by yzh119 - 4
[Feature Request] Versatile head dimension
#142 opened by yzh119 - 0
vLLM support
#202 opened by MikeChenfu - 5
Faster compilation times
#154 opened by skrider - 5
[Roadmap] FlashInfer v0.1.0 release checklist
#19 opened by yzh119 - 7
Make flashinfer kernels cuda graphs friendly
#187 opened by AgrawalAmey - 2
Compare Append Kernel's Results with Xformers
#192 opened by LiuXiaoxuanPKU - 6
How to use low-bit KV Cache in flashinfer?
#125 opened by zhaoyang-star - 0
Does flashinfer support float datatype?
#191 opened by ZSL98 - 1
QUESTION: Does the C++ API support Ragged Tensors now?
#189 opened by yz-tang - 3
Basic inference example for LLama/Mistral
#108 opened by vgoklani - 5
How was the data in the blog measured?
#188 opened by cloudhan - 1
flashinfer build error
#186 opened by yz-tang - 4
Support for Volta / Turing architectures
#160 opened by tgaddair - 1
[BUG] Yi-34B model compatibility
#181 opened by Qubitium - 0
[Tracking Issue] PyTorch bindings
#64 opened by yzh119 - 0
Could you release a wheel for Python 3.8 as well?
#129 opened by WoosukKwon - 0
[Roadmap] 0.0.3 Release Checklist
#138 opened by yzh119 - 5
0.0.3 wheels not in flashinfer.ai/whl/
#168 opened by Qubitium - 1
Wheels version bumping
#175 opened by hnyls2002 - 0
JIT compilation
#170 opened by yzh119 - 2
Google Gemma runtime error with half dtype
#157 opened by hnyls2002 - 1
[Performance] Support strides in attention kernels
#163 opened by yzh119 - 1
Quantization support
#150 opened by zhyncs - 2
Sliding window attention
#159 opened by WoosukKwon - 5
Downloadable Package in PyPI
#153 opened by WoosukKwon - 1
Still looking forward to an e2e example!
#149 opened by ZSL98 - 2
Float8 cache usage
#155 opened by YLGH - 4
Where can I find end-to-end examples?
#51 opened by WoosukKwon - 0
[Feature request] Interleaved RoPE support
#151 opened by guocuimi - 3
Could you support AliBi attention bias?
#137 opened by WoosukKwon - 0
[Feature Request] More versatile GQA group sizes
#140 opened by yzh119 - 0
Can I profile only the dense or attention layer in flashinfer, rather than the whole kernel?
#139 opened by yintao-he - 1
Support Gemma model shape
#130 opened by yzh119 - 2
[Compiling Issue] error: no instance of function template "flashinfer::BatchPrefillWithPagedKVCacheWrapper" matches the argument list
#134 opened by yintao-he - 1
Please release pre-built wheels for python 3.9
#112 opened by merrymercy - 0
Compilation error on A100 + cuda 12 + python3.9
#113 opened by merrymercy - 0
[Tracking Issue] Documentation and Examples
#67 opened by yzh119 - 1
[Tracking Issue] Prebuilt PyPI wheels
#66 opened by yzh119 - 0
[Roadmap] FlashInfer v0.0.1 release checklist
#11 opened by yzh119 - 0