Lightning-AI/lightning-thunder
Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
Python · Apache-2.0
Issues
LitGPTSDPABenchmark runs incorrect configs
#317 opened by vedaanta (0 comments)
NumberProxy is no Number
#272 opened by jjsjann123 (2 comments)
Thunder + Inductor gives OOM for stablecode-completion-alpha-3b model from LitGPT
#246 opened by mpatel31415 (3 comments)
Expose `torch.compile` arguments as compile options
#281 opened by carmocca (6 comments)
implement zip lookaside in Python interpreter (enables e.g. thunder.jit with zip from LitGPT LLaMAMoE)
#284 opened by IvanYashchuk (3 comments)
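For background on the issue above: Thunder's Python interpreter can substitute a "lookaside" for an opaque callable, i.e. a pure-Python replacement it can step through instead of the C builtin. A minimal conceptual sketch of the idea, assuming a hypothetical `LOOKASIDES` table and `interpret_call` entry point (these names are illustrative, not Thunder's actual API):

```python
def zip_lookaside(*iterables):
    # Pure-Python reimplementation of builtins.zip (stops at the
    # shortest iterable), which an interpreter could trace through.
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:
            try:
                result.append(next(it))
            except StopIteration:
                return
        yield tuple(result)

# Hypothetical table mapping opaque callables to traceable replacements.
LOOKASIDES = {zip: zip_lookaside}

def interpret_call(fn, *args):
    # Route through the lookaside table when a replacement is registered.
    return LOOKASIDES.get(fn, fn)(*args)

print(list(interpret_call(zip, [1, 2, 3], "ab")))  # [(1, 'a'), (2, 'b')]
```

The replacement must match the builtin's semantics exactly (here: truncation at the shortest input) so that traced and eager execution agree.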
`thunder.jit` fails with `nn.Softmax` raising `got an unexpected keyword argument '_stacklevel'`
#258 opened by ptrblck (0 comments)
Add the torch.compile executor as a test executor
#299 opened by carmocca (0 comments)
Support FSDP and torch.compile
#298 opened by carmocca (0 comments)
Dynamic constraints and NumberProxies
#262 opened by jjsjann123 (1 comment)
torch.unflatten not supported by thunder.jit
#288 opened by Fuzzkatt (1 comment)
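For reference, `torch.unflatten(input, dim, sizes)` replaces one dimension with several whose product equals the original extent, inferring at most one `-1` entry. A pure-Python sketch of that shape arithmetic, with no PyTorch dependency (the helper name `unflatten_shape` is made up for illustration):

```python
import math

def unflatten_shape(shape, dim, sizes):
    # Sketch of the shape rule behind torch.unflatten: replace shape[dim]
    # with `sizes`, inferring at most one -1 entry from the remainder.
    dim = dim % len(shape)          # support negative dims
    sizes = list(sizes)
    known = math.prod(s for s in sizes if s != -1)
    if -1 in sizes:
        if shape[dim] % known:
            raise ValueError("cannot infer -1: sizes do not divide extent")
        sizes[sizes.index(-1)] = shape[dim] // known
    if math.prod(sizes) != shape[dim]:
        raise ValueError("sizes do not multiply to the original extent")
    return tuple(shape[:dim]) + tuple(sizes) + tuple(shape[dim + 1:])

print(unflatten_shape((2, 12, 5), 1, (3, 4)))    # (2, 3, 4, 5)
print(unflatten_shape((2, 12, 5), 1, (-1, 6)))   # (2, 2, 6, 5)
```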
torch.nn.MultiheadAttention with thunder.jit error
#287 opened by Fuzzkatt (6 comments)
Remove all occurrences of thunder.compile and TestExecutor.make_callable_legacy
#198 opened by IvanYashchuk (1 comment)
jit: `torch.cuda.stream` and other related functionality are silently ignored when jitting.
#280 opened by kshitij12345 (1 comment)
`thunder.distributed.utils.sort_waits` is broken
#277 opened by IvanYashchuk (3 comments)
Support NeMo StableDiffusion network
#266 opened by athitten (0 comments)
Implement TensorBase.gather
#267 opened by athitten (0 comments)
Implement TensorBase.long
#268 opened by athitten (0 comments)
Implement _VariableFunctionsClass.randint of torch
#269 opened by athitten (5 comments)
[ci]: Add a CI flow with TransformerEngine installed so that we can run the relevant tests.
#196 opened by kshitij12345 (0 comments)
Timeout for Platypus-30B and Thunder compile
#294 opened by mpatel31415 (0 comments)
getitem grad is not calculated properly on Windows
#296 opened by mruberry (8 comments)
`torch.Tensor.numel` method: Don't know how to interpret a callable with type <class 'int'>
#240 opened by kshitij12345 (0 comments)
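For context on the issue above: `Tensor.numel()` simply returns the product of the tensor's shape (1 for a zero-dimensional tensor), so the semantics being interpreted are trivial. A pure-Python sketch, independent of PyTorch (the helper name `numel` is illustrative):

```python
import math

def numel(shape):
    # Number of elements implied by a shape tuple; math.prod(()) == 1,
    # matching a zero-dimensional (scalar) tensor.
    return math.prod(shape)

print(numel((2, 3, 4)))  # 24
print(numel(()))         # 1
print(numel((5, 0, 7)))  # 0 -- any zero-sized dim makes the tensor empty
```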
increase of GPU memory footprint
#216 opened by mpatel31415 (1 comment)
Long compilation time
#229 opened by mpatel31415 (1 comment)
Feature request: Support sharding parameters where first dimension is not divisible by 8
#248 opened by mpatel31415 (3 comments)
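One common way to handle a leading dimension that does not divide evenly by the shard count is to pad it up to the next multiple before splitting, then slice the padding off after gathering. A sketch of that padding arithmetic (pure Python; the function name is made up, and this is not necessarily how Thunder's sharding would implement it):

```python
def padded_shard_extents(dim0, world_size):
    # Pad dim0 up to the next multiple of world_size, then split evenly.
    pad = (-dim0) % world_size            # 0 when already divisible
    chunk = (dim0 + pad) // world_size    # per-rank shard extent
    return pad, chunk

pad, chunk = padded_shard_extents(50, 8)
print(pad, chunk)  # 6 7 -> 50 + 6 = 56 = 8 * 7
```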
cuDNN SDPA executor has CPU overhead
#241 opened by parthmannan (0 comments)
Add support for torch.gather
#223 opened by IvanYashchuk (2 comments)
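For reference, `torch.gather(input, dim, index)` selects elements along `dim` using an index tensor of the output's shape: for `dim=1` in 2-D, `out[i][j] = input[i][index[i][j]]`. A nested-list sketch of that rule for the 2-D case (pure Python, illustrative only):

```python
def gather2d(inp, dim, index):
    # 2-D gather: out has index's shape; each entry selects along `dim`.
    # dim == 0: out[i][j] = inp[index[i][j]][j]
    # dim == 1: out[i][j] = inp[i][index[i][j]]
    if dim == 0:
        return [[inp[index[i][j]][j] for j in range(len(index[0]))]
                for i in range(len(index))]
    return [[inp[i][index[i][j]] for j in range(len(index[0]))]
            for i in range(len(index))]

t = [[1, 2], [3, 4]]
print(gather2d(t, 1, [[0, 0], [1, 0]]))  # [[1, 1], [4, 3]]
```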
Add sanity check for primitive inplace copy operator
#265 opened by kiya00 (2 comments)
benchmarking — create a notebook showing how to work with the single gpu benchmarks
#205 opened by mruberry (4 comments)
Add support for FP8E4M3 and FP8E5M2 dtypes
#254 opened by IvanYashchuk (10 comments)
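As background, the two FP8 formats trade precision for range differently: E4M3 (4 exponent bits, bias 7, 3 mantissa bits) tops out at 448 because only its all-ones mantissa pattern at the top exponent is reserved for NaN, while E5M2 (5 exponent bits, bias 15, 2 mantissa bits) keeps IEEE-style inf/NaN and tops out at 57344. A small sketch computing those limits (constants per the OCP FP8 formats; background only, not Thunder code):

```python
def fp8_max(exp_bits, man_bits, bias, reserve_top_exponent):
    # Largest finite value of an FP8 format. If the top exponent code is
    # reserved for inf/NaN (IEEE-style, as in E5M2), the max usable
    # exponent code is one lower; E4M3 instead gives up only the
    # all-ones mantissa pattern (its single NaN encoding).
    top = (1 << exp_bits) - 1 - (1 if reserve_top_exponent else 0)
    man_max = (1 << man_bits) - 1 - (0 if reserve_top_exponent else 1)
    return 2.0 ** (top - bias) * (1 + man_max / (1 << man_bits))

print(fp8_max(4, 3, 7, reserve_top_exponent=False))  # 448.0   (E4M3)
print(fp8_max(5, 2, 15, reserve_top_exponent=True))  # 57344.0 (E5M2)
```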
If saved_for_backward returns NumberProxy, the value is taken from compile time, not runtime
#231 opened by kiya00 (0 comments)
Benchmarking suite that runs scripts
#224 opened by riccardofelluga (0 comments)
Enable xfailed tests from test_apex_executor.py
#220 opened by IvanYashchuk (0 comments)
optimizer: jitting the optimizer step
#204 opened by kshitij12345 (0 comments)
Support non_blocking in Tensor.to
#197 opened by kshitij12345 (3 comments)
Mixtral 8x7B network support
#194 opened by riccardofelluga (0 comments)
Support `torch.nonzero`
#195 opened by riccardofelluga (0 comments)