Issues
Converting checkpoints
#551 opened by peregilk - 7
Supported features
#571 opened by peregilk - 0
Update Inference Microbenchmark scripts
#660 opened by jon-chuang - 4
Question: Gradient Accumulation
#607 opened by thiagolaitz - 2
How to convert a model to parameter only checkpoints (unscanned) on a CPU VM
#634 opened by hosseinsarshar - 1
Document use of Mistral
#521 opened by borisdayma - 4
Reproducing pure computation TFLOPs
#624 opened by prrathi - 1
Assignment
#622 opened by Cyberwoodd - 3
Support LoRA training
#609 opened by hxssgaa - 0
Support for RecurrentGemma
#605 opened by cyrilzakka - 2
Cannot do inference in float32
#595 opened by borisdayma - 0
Support beam search
#594 opened by borisdayma - 4
Support for T5
#560 opened by kishorenc - 1
Support Qwen1.5
#585 opened by Muhtasham - 2
Gemma instructions were deleted in commit
#579 opened by emergenz - 1
Issues running test_llama2_7b.sh on TPU VM v3-8
#572 opened by korney3 - 1
`attend_dtype` not used
#531 opened by zhixuan-lin - 0
Create a user friendly inference demo
#532 opened by borisdayma - 6
TFDS Data Processing Pipeline
#475 opened by LeoXinhaoLee - 2
Convert Gemma weights with scan layers
#528 opened by borisdayma - 3
Grain vs. `tf.data` Input Pipeline
#523 opened by leandrolcampos - 3
Convert Gemma weights
#527 opened by borisdayma - 7
[Bug] adam_pax has reuse donated buffer warning
#490 opened by LeoXinhaoLee - 0
Compatibility issue with tensorflow>=2.15.1 on GPU
#516 opened by chajath - 2
sharding options with grain
#477 opened by bgyoon - 1
How to use GPT2 tokenizer
#474 opened by LeoXinhaoLee - 5
setup.sh runs `rm ~/jax`
#480 opened by mattjj - 1
Can AQT be used to calculate qk score?
#478 opened by Lisennlp - 0
[Question] Loading in a HF Dataset
#471 opened by karan-dalal - 3
A pip error occurs when running setup.sh.
#433 opened by myyrakle - 1
[request] bloom (alibi) model implementation
#397 opened by bzantium - 1
Issues running decode example from readme
#413 opened by MicPie - 7
Issues running end_to_end/test_mistral.sh
#412 opened by MicPie - 3
Long sequences are dropped rather than trimmed
#274 opened by reinerp - 2
Local development instructions don't work
#245 opened by finbarrtimbers - 1
Do the Attentions / MLPs run in parallel?
#218 opened by tensorpro - 4
TPUv2-8 multislice
#158 opened by ethanhe42 - 1
You don't have to
#110 opened by scanlime