Issues
- 6
https://us-python.pkg.dev/gce-ai-infra/maxtext-build-support-packages/simple/ not public
#758 opened by emergenz - 0
Flash attention - head_dim 64
#1047 opened by peregilk - 0
Why logit checker has such a high tolerance?
#1021 opened by hugoabonizio - 4
- 3
PGLE doesn't work for Tensor Parallelism
#1005 opened by wang2yn84 - 5
Cannot do inference in float32
#595 opened by borisdayma - 4
Long Context
#801 opened by peregilk - 26
- 1
nucleus top_p sampling seems wrong? (edit: nvm, read and tested the code wrong)
#950 opened by honglu2875 - 4
Training more than one epoch
#914 opened by peregilk - 2
- 0
Support nsys profiler upload in all cases
#911 opened by gobbleturk - 2
- 0
- 0
- 2
Unable to recover after checkpoint saving
#868 opened by peregilk - 0
Support beam search
#594 opened by borisdayma - 0
Support for RecurrentGemma
#605 opened by cyrilzakka - 2
- 2
Support LoRA training
#609 opened by hxssgaa - 0
- 2
- 0
- 1
How to implement 1F1B pipeline parallelism in Jax?
#752 opened by MoFHeka - 0
Inconsistent environment variable names
#775 opened by gabeweisz - 3
- 0
Make MaxText as Python Modules
#819 opened by JoeZijunZhou - 3
Converting LLama3.1 405B checkpoint - Requesting multipass checkpoint conversion
#864 opened by shivajid - 0
- 0
Inconsistent code formatting
#735 opened by jmschndev - 14
- 11
mlperf gpt3 ckpt permission issues
#847 opened by gramesh-amd - 7
Cannot load the paxml gpt3 tokenizer
#875 opened by gramesh-amd - 3
- 6
Question: Gradient Accumulation
#607 opened by thiagolaitz - 1
FlashAttention Support - TPUv3
#791 opened by maciek-pioro - 1
aqtp release 0.8.0 breaking dependencies
#849 opened by bernardhan33 - 3
Gemma 2 support
#733 opened by borisdayma - 0
`hf_access_token` only effective for loading gated datasets, not gated tokenizers
#734 opened by jmschndev - 1
Outdated links in `First_run.md`
#776 opened by emergenz - 1
Eval on C4?
#711 opened by tjingrant - 0
Update Inference Microbenchmark scripts
#660 opened by jon-chuang - 2
How to convert a model to parameter only checkpoints (unscanned) on a CPU VM
#634 opened by hosseinsarshar - 4
Reproducing pure computation TFLOPs
#624 opened by prrathi - 1
Assignment
#622 opened by Cyberwoodd - 3
- 1
- 1
Support Qwen1.5
#585 opened by Muhtasham - 2
Gemma instructions were deleted in commit
#579 opened by emergenz - 1
Issues running test_llama2_7b.sh on TPU VM v3-8
#572 opened by korney3