ELS-RD/kernl

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Jupyter Notebook
Apache-2.0

Issues

bug: Fails running dynamic shapes
#338 opened a year ago by michaelfeil
0
Confused about block_delta
#337 opened a year ago by zhanglei1172
0
bug: optimize_model() fails on HF's GPT2 with "RuntimeError: CUDA error: operation not permitted when stream is capturing"
#336 opened a year ago by CorentinJ
4
bug: Llama model optimization failing
#317 opened 2 years ago by AndrewMead10
4
bug: Experimental/Whisper notebook (speedup.ipynb) is not working
#331 opened a year ago by Artyom17
0
bug: Torch.dynamo is not working on H100 due to obsolete triton & pytorch
#330 opened a year ago by Artyom17
0
bug: start_position support for the fused attention kernel
#329 opened a year ago by ipoletaev
0
bug: Bart speedup only 1.6x
#327 opened 2 years ago by sinking-point
0
bug: Llama reproduce error with kernl
#321 opened 2 years ago by yychen016
1
bug: How to save the optimized model to file?
#313 opened 2 years ago by aaronchan90
3
bug: does kernl support pipeline parallel?
#323 opened 2 years ago by ninisy
0
bug: Could not get kernl running on CodeT5
#283 opened 2 years ago by TheSeamau5
6
proposal: Write GEMM (matrix multiplication) triton optimization animation
#318 opened 2 years ago by pommedeterresautee
0
Review the doc so that it displays correctly on the site?
#290 opened 2 years ago by white-gorilla
0
bug: AttributeError: module 'torch._utils' has no attribute 'is_compiling'
#266 opened 2 years ago by gilljon
2
docs: automatic code reference generation
#280 opened 2 years ago by jonathlela
1
feature: using TorchBench to test the coverage
#305 opened 2 years ago by xuzhao9
2
bug: Triton 2.0 makes attention kernel crash
#314 opened 2 years ago by pommedeterresautee
1
bug: Trying to run T5 tutorial and getting `free(): invalid pointer` error.
#310 opened 2 years ago by gilljon
1
[FRONT] Linking kernl and the blog
#281 opened 2 years ago by white-gorilla
0
[FRONT] Upgrade kernl's M4M configuration with the Insiders version.
#306 opened 2 years ago by white-gorilla
0
Where can I install the software (on which hosting I can try it)
#253 opened 2 years ago by Oxi84
5
[FRONT] Remove/comment empty sections or put something more engaging, less crude.
#259 opened 2 years ago by white-gorilla
0
Torchdynamo + Inductor faster than kernl in t5 e2e example?
#232 opened 2 years ago by wangjunhaoumich
5
bug: tests failing at nvidia-driver-530
#304 opened 2 years ago by christallire
0
Accelerate warmup IRL?
#242 opened 2 years ago by JaheimLee
4
feature: non verbose CI
#302 opened 2 years ago by pommedeterresautee
1
Installation problem
#293 opened 2 years ago by p-christ
7
feature: run tests on CI
#289 opened 2 years ago by pommedeterresautee
2
I run the bert e2e example, if batch is not 1, I get an error!!!
#286 opened 2 years ago by lichun-wang
2
bug: Whisper "Segmentation fault (core dumped)" / inference got stuck
#272 opened 2 years ago by philschmid
6
feature: introduce int8 quant kernel
#288 opened 2 years ago by pommedeterresautee
0
[M4M Insider] Upgrade version (kernl + blog) to 4.30.2.
#258 opened 2 years ago by white-gorilla
2
bug: ERROR: Package 'kernl' requires a different Python: 3.8.10 not in '==3.9.*'
#282 opened 2 years ago by silvacarl2
1
version is still 0.1.0
#273 opened 2 years ago by JaheimLee
2
[M4M | BLOG] Updated kernl/docs CI to support M4M Insiders.
#247 opened 2 years ago by white-gorilla
0
feature: reduce memory overhead in CG
#267 opened 2 years ago by pommedeterresautee
0
bug: test test_optimized_model fails since HF transformer 4.26
#264 opened 2 years ago by pommedeterresautee
0
bug: when executed in a specific order, tests crash
#263 opened 2 years ago by pommedeterresautee
0
feature: replace Whisper script by a notebook
#262 opened 2 years ago by pommedeterresautee
0
bug: memory leak
#256 opened 2 years ago by pommedeterresautee
1
[M4M Insiders] Upgrade version.
#254 opened 2 years ago by white-gorilla
1
[M4MInsiders] Targeting a specific version of the parent repository
#252 opened 2 years ago by white-gorilla
1
[M4MInsiders] Fix automatic docker image update
#248 opened 2 years ago by white-gorilla
1
[FRONT] The contribution guide is broken.
#249 opened 2 years ago by white-gorilla
0
Add conventions to triton kernel writing
#229 opened 2 years ago by gaetansnl
0
bug: RuntimeError("GPU compute capability 8.0 (Ampere) or higher is required to use Kernl")
#246 opened 2 years ago by Oxi84
1
bug: memory leak in Pytest / CUDA Graph
#244 opened 2 years ago by pommedeterresautee
0
Cache transformed dynamo graph to speedup warmup
#234 opened 2 years ago by pommedeterresautee
1
Linter and formatter configuration incompatibility
#230 opened 2 years ago by gaetansnl
0