Issues
Any plans for Llama 3?
#1583 opened by awsaf49 - 2
Preprocessor does not respect sequence_length
#1627 opened by 52631 - 2
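For context on the expected behavior behind this report: a preprocessor configured with a `sequence_length` should pad or truncate every output to exactly that length. A minimal toy sketch of that contract (illustrative only, not the keras_nlp implementation):

```python
def pad_or_truncate(token_ids, sequence_length, pad_id=0):
    """Force a list of token ids to exactly `sequence_length` items."""
    if len(token_ids) >= sequence_length:
        # Truncate inputs that are too long.
        return token_ids[:sequence_length]
    # Pad inputs that are too short.
    return token_ids + [pad_id] * (sequence_length - len(token_ids))

print(pad_or_truncate([5, 6, 7, 8, 9], 3))  # [5, 6, 7]
print(pad_or_truncate([5, 6], 4))           # [5, 6, 0, 0]
```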
Distributed batch size not calculated correctly
#1630 opened by natbprice - 0
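The arithmetic at stake in reports like this: under data parallelism, the global batch size should equal the per-replica batch size times the number of replicas. A hedged sketch of that relationship (toy code, not the library's logic):

```python
def per_replica_batch_size(global_batch_size, num_replicas):
    """Split a global batch evenly across replicas, or fail loudly."""
    if global_batch_size % num_replicas != 0:
        raise ValueError(
            f"Global batch size {global_batch_size} is not divisible "
            f"by {num_replicas} replicas."
        )
    return global_batch_size // num_replicas

print(per_replica_batch_size(128, 8))  # 16
```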
Cannot export a slightly customized XLMRoberta model from keras_nlp
#1629 opened by YangIsNotAvailable - 0
Add support for `PaliGemma`
#1626 opened by awsaf49 - 1
DebertaV3MaskedLM example doesn't work
#1622 opened by mrektor - 0
unable to diagnose OOM
#1628 opened by josharian - 0
403 KaggleApiHTTPError while running GemmaCausalLM
#1625 opened by nashschool - 1
Any plans for more Llama type models?
#1587 opened by pass-lin - 4
Issue instantiating a keras_nlp.models.Backbone from a model preset of Hugging Face handles
#1574 opened by RandomWalkie - 4
Gemma Model Storing and Loading after Fine tuning
#1482 opened by kreouzisv - 2
GemmaBackbone.get_layout_map broken for gemma_2b_en
#1613 opened by josharian - 3
Issue when fine-tuning Albert - Resource localhost/_0_SentencepieceOp/N10tensorflow4text12_GLOBAL__N_121SentencepieceResourceE does not exist.
#1573 opened by deathsaber - 8
I installed keras-nlp in PyCharm IDE, when I run
#1426 opened by said-ml - 3
Cannot reproduce results from notebook on Colab
#1592 opened by jespernwulff - 12
keras-nlp insists I use the (buggy) Tensorflow 2.16.1 which does not work with my GPU
#1519 opened by nas-mouti - 7
[RfC] Ideas for better Hugging Face Hub integration
#1529 opened by Wauplin - 2
Any plans for QLora?
#1537 opened by asmith26 - 1
Update ByteTokenizer to remove TensorFlow dependency
#1469 opened by stereoplegic - 0
cannot import name 'CachedMultiHeadAttention' from partially initialized module 'keras_nlp.src.layers.modeling.cached_multi_head_attention' (most likely due to a circular import)
#1427 opened by anilmamidwar15021991 - 6
Samplers in Gemma model
#1588 opened by mostafamdy - 0
Any plans for more Llama 3?
#1586 opened by pass-lin - 3
Add Electra Weights to Kaggle Models
#1422 opened by pranavvp16 - 4
Data-Parallel Training with KerasNLP and tf.distribute example dataset problem
#1504 opened by sitamgithub-MSIT - 3
Make the local variable per_token_loss in the score method global, so that we can modify the loss function.
#1539 opened by deveshklt - 2
`SentencePieceTokenizer` inside a `keras.models.Model` fails to be reconstructed during `keras.saving.load_model()`
#1522 opened by briango28 - 0
Add grok-1
#1525 opened by innat - 3
Feature Request: Transformer Debugger - Debugging and controlling the behavior of transformer based LLM models.
#1513 opened by abhaskumarsinha - 3
Add Mistral 0.2 models as possible presets
#1515 opened by borisdayma - 1
Gemma discrepancies
#1494 opened by awsaf49 - 5
Question about Gemma tensor parallel sharding policy
#1464 opened by AIGideon - 5
Model weights contributions?
#1463 opened by deep-diver - 0
Keras_NLP and Kaggle Hub: Are models allowed without weights in Kaggle Hub?
#1433 opened by abhaskumarsinha - 4
How to add a serialized model and weights of a keras model to keras-nlp?
#1479 opened by abhaskumarsinha - 2
Add CLIP tokenizer to Keras NLP
#1453 opened by divyashreepathihalli - 1
Add `oov_token` Argument to `BytePairTokenizer`
#1466 opened by abuelnasr0 - 2
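To illustrate the feature being requested: an `oov_token` argument typically makes a tokenizer map unknown tokens to a designated out-of-vocabulary id instead of failing. A toy sketch of that behavior (hypothetical names, not the BytePairTokenizer API):

```python
class ToyTokenizer:
    def __init__(self, vocab, oov_token="[UNK]"):
        self.vocab = dict(vocab)
        self.oov_token = oov_token
        # Assign the OOV token an id if it is not already in the vocab.
        self.vocab.setdefault(oov_token, len(self.vocab))

    def token_to_id(self, token):
        # Unknown tokens fall back to the OOV id rather than raising.
        return self.vocab.get(token, self.vocab[self.oov_token])

tok = ToyTokenizer({"hello": 0, "world": 1})
print(tok.token_to_id("hello"))    # 0
print(tok.token_to_id("missing"))  # 2 (the [UNK] id)
```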
ContrastiveSampler lacks a seed param, while the docstring states it has one
#1481 opened by martin-gorner - 4
Any guide how to use tools/gemma/run_gemma_xla.py?
#1461 opened by deep-diver - 2
Mistral kills the process by taking too much RAM
#1458 opened by deep-diver - 26
Preset and doc for Mistral (multilingual)
#1418 opened by federicoparra - 0
Issue with `BytePairTokenizer`
#1435 opened by abuelnasr0 - 0