keras-team/keras-nlp

How to add a serialized model and weights of a keras model to keras-nlp?

Opened this issue · 4 comments

Hello, I'm having some trouble with the keras_nlp codebase. If anyone understands the codebase and how the models interact within it, please feel free to chime in.

  • Suppose I already have a Keras model with pre-trained weights, and it works fine.
  • What is the easiest way to add it to keras-nlp, given that my code needs to integrate with keras-nlp modules, which should download the weights from Kaggle and run the model here?
  • I have uploaded the weights to Kaggle along with the code via the serialization utils, and now I need to get the model into keras-nlp too.
  • How a model loads its presets is the part of the code that confuses me the most.
  • Also, there aren't any specific docs for people who don't want to contribute to keras-nlp itself at the moment but want to add models here for easier access by Keras users.

Thank You!

Here's the best of my understanding.

I didn't understand the def presets(cls) at the end of each model's backbone file; these definitions seem repetitive. The rest all seems okay.
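For anyone else puzzling over the same thing, here is a minimal pure-Python sketch of what a class-level presets property combined with a from_preset lookup can look like. This is a simplified illustration, not the actual keras-nlp implementation: ToyBackbone, toy_presets, the preset name, and the Kaggle handle are all made up, and the "kaggle_handle" key is an assumption about the registry layout.

```python
import copy

# Hypothetical registry. The name, description, and handle are invented
# for illustration; the "kaggle_handle" key is an assumed layout.
toy_presets = {
    "toy_base_en": {
        "metadata": {"description": "Toy base model (illustrative only)."},
        "kaggle_handle": "kaggle://example-user/toy/keras/toy_base_en/1",
    },
}


class classproperty(property):
    """Minimal read-only property that can be read from the class itself."""

    def __get__(self, instance, owner):
        # Pass the class (not an instance) to the wrapped function.
        return self.fget(owner)


class ToyBackbone:
    """Stand-in for a model class; it stores only the resolved handle."""

    def __init__(self, handle):
        self.handle = handle

    @classproperty
    def presets(cls):
        # Deep copy so callers cannot mutate the shared registry.
        return copy.deepcopy(toy_presets)

    @classmethod
    def from_preset(cls, preset):
        # A short name is looked up in the registry; anything else is
        # treated as a handle or path and passed through unchanged.
        if preset in cls.presets:
            preset = cls.presets[preset]["kaggle_handle"]
        # A real loader would download the config and weights here.
        return cls(handle=preset)


model = ToyBackbone.from_preset("toy_base_en")
print(model.handle)
```

Under these assumptions, each model class repeats the small presets definition because each one returns its own registry, which would explain why the pattern looks repetitive across backbone files.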

I'm a bit confused by the task.py and generative_task.py files here: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/task.py , https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/generative_task.py ....

Great question! The short answer is that this is on the way. There are three things currently in flight.

  1. An easy upload API (no need for preset utils). Something like model.upload_preset(kaggle_handle).
  2. Public model uploads for all Kaggle users (they might have actually flipped the bit for this, will check).
  3. A guide showing the basic flows, so this doesn't require code spelunking.

We should have all of this ready in a couple weeks! Will keep you posted!

@mattdangerw I have a few models whose architecture (and weights) I've ported from PyTorch HF code; the weights load and inference works (NOT training for now, I'm working on that part). Should I subclass the backbone, upload the weights to Kaggle, and PR the code here with inputs and outputs in __init__() (as other models do)?

I'm not sure how it's done, but I looked at some of the T5/XLNet Transformer code for reference.

Additionally, I'm looking to port YOLO 9 to KerasCV after I experiment with a few more things and understand the codebase here.

Just in case, note that the Blenderbot 400M (distilled) version has been ported to Keras v3. It can easily be extended to the other HF versions, from 90M to 2B, too. Link: https://github.com/abhaskumarsinha/Keras-Blenderbot

Just tag me once everything is ready; I have a few more models that I'm working on adding here!

[Note: I haven't used the Keras tokenizer, as BPE from keras-nlp requires Keras v2 and isn't compatible with Keras v3. I've tested it enough to know it works once the vocab and merge files are ready; it just needs an upgrade. So I'm using only the HF tokenizer in the Colab demo!]