Issues
- llava support (#88 opened by sonic182, 6 comments)
- [Question] BasicTransformerBlock (#96 opened by JH-ninjatech, 5 comments)
- NaN outputs when masking llama model inputs (#79 opened by dacorvo, 1 comment)
- Vicuna13B model support (#66 opened by petrovicu, 6 comments)
- Inferring logits from `model.forward` for the entire batch instead of the last forward's output (#73 opened by michaelfeil, 3 comments)
- Can't save/serialize any models except GPT2 (#58 opened by awskila, 0 comments)
- Any plan to support Qwen-2 Model (#89 opened by mynewstart, 1 comment)
- Add support for `gemma` models (#82 opened by benglewis, 4 comments)
- Compilation error on llama 7B with batch size 8 (#59 opened by dacorvo, 7 comments)
- Improve Neuron model loading time (#80 opened by dacorvo, 5 comments)
- Generate Llama 2 from Embeddings (#72 opened by liechtym, 8 comments)
- Mixtral config issue -- not handling null well (#71 opened by jimburtoft, 0 comments)
- Add support for Baichuan-13B model (#83 opened by cszhz, 2 comments)
- Issue while compiling Mistral 7B 0.2 Instruct (#77 opened by josete89, 4 comments)
- Support for Mistral-7B model (#50 opened by henghui-zhu-amazon, 5 comments)
- Possible error in top-p filtering (#46 opened by dacorvo, 3 comments)
- save_split seems to be broken after transformers made safetensor serialization default (#55 opened by jitto, 4 comments)
- `stopping_criteria_list(input_ids, probs)` does not check for the correct sequence (#75 opened by michaelfeil, 1 comment)
- Support for MPT model (#74 opened by klutzDrawers, 3 comments)
- Any solution to save the converted model? (#29 opened by aliseyfi, 11 comments)
- Inf2 Modified Llama 2 Loading Issue (#67 opened by liechtym, 2 comments)
- How to use generate() with inputs_embeds (#70 opened by liechtym, 8 comments)
- Compilation errors for llama 2 models (#45 opened by dacorvo, 2 comments)
- Support for encoder-decoder models (#51 opened by kwontaek-amazon, 2 comments)
- Mixtral Model support (#65 opened by enochlev, 4 comments)
- llama-2/codellama benchmark for inf2.xlarge (#64 opened by zliendo, 6 comments)
- Llama2 inference overhead time way too long (#63 opened by enochlev, 4 comments)
- AssertionError when running fine-tuned llama 2 (#40 opened by eladspi, 2 comments)
- from_pretrained is broken after transformers made safetensor serialization default (#60 opened by dennj, 4 comments)
- How to set FI_EFA_FORK_SAFE=1? (#37 opened by yogendra-yatnalkar, 3 comments)
- ImportError: cannot import name 'neuron_xla_compile' from 'libneuronxla' (#33 opened by yogendra-yatnalkar, 2 comments)
- Corrupted output with llama prototype model (#30 opened by dacorvo, 2 comments)
- neuronx-cc --target (#31 opened by sheenobu)