Tested with the TensorFlow master branch (presumably, tf-nightly should work as well)
- group normalization: update it if you run into related problems; see keras-team/keras-cv#1035
- text encoder and decoder: Converting the text_encoder and decoder is trivial.
- diffusion model: Converting the diffusion model needs extra effort. Its weights are about 3.4 GiB, far larger than the 2 GiB file size limit of FlatBuffers, the format TFLite uses for its models. It would certainly be possible to modify FlatBuffers to use 64-bit offsets and overcome the 2 GiB limit, but that would produce incompatible files. A conversion sketch covering both sub-model types follows this list.
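A minimal sketch of both conversion paths, assuming the Keras CV `StableDiffusion` class exposes `text_encoder`, `decoder`, and `diffusion_model` sub-models (attribute names may drift across keras_cv versions); the output file names are placeholders. Dynamic range quantization stores weights as int8, roughly quartering the file size, which is what brings the diffusion model under the 2 GiB limit.

```python
import tensorflow as tf
import keras_cv  # assumption: a keras_cv release with the StableDiffusion implementation

model = keras_cv.models.StableDiffusion(img_height=512, img_width=512)

def convert(sub_model, path, dynamic_range=False):
    converter = tf.lite.TFLiteConverter.from_keras_model(sub_model)
    if dynamic_range:
        # Dynamic range quantization: weights become int8, activations
        # stay float, file size drops to roughly a quarter of float32.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
    with open(path, "wb") as f:
        f.write(converter.convert())

# Trivial cases: the float32 models fit under the 2 GiB FlatBuffers limit.
convert(model.text_encoder, "text_encoder.tflite")
convert(model.decoder, "decoder.tflite")
# The ~3.4 GiB float32 diffusion model does not, so quantize its weights.
convert(model.diffusion_model, "diffusion_model.tflite", dynamic_range=True)
```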
It's also possible to generate quantized int8 models with post-training quantization (PTQ). I don't know how to build a proper representative dataset, but as a proof of concept, I wrote a script that uses only one sample input as the dataset :-)
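A sketch of that proof of concept, reusing `model` from the snippet above: the TFLite converter accepts a generator as the representative dataset, so a single hand-made sample is enough to drive calibration. The input shapes and vocabulary size are illustrative assumptions modelled on the CLIP text encoder (77 tokens), not values taken from the actual script.

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Proof of concept: one made-up sample instead of a real
    # calibration set; yield one array per model input.
    tokens = np.random.randint(0, 49408, size=(1, 77), dtype=np.int32)
    positions = np.arange(77, dtype=np.int32)[None, :]
    yield [tokens, positions]

converter = tf.lite.TFLiteConverter.from_keras_model(model.text_encoder)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
with open("text_encoder_int8.tflite", "wb") as f:
    f.write(converter.convert())
```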
Conversion script
Borrowing some code from the Keras CV implementation, we can run an end-to-end test of the converted TFLite models (an inference sketch follows the list below):
- dynamic range quantized models: notebook. The 'man-on-skateboard-cropped.png' is from this tutorial.
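The core of such a test is just the stock TFLite interpreter. A minimal sketch with a placeholder file name and dummy inputs; a real run would tokenize a prompt with the Keras CV tokenizer and chain the three models:

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="text_encoder.tflite")
interpreter.allocate_tensors()

# Dummy inputs; order must match what get_input_details() reports.
tokens = np.random.randint(0, 49408, size=(1, 77), dtype=np.int32)
positions = np.arange(77, dtype=np.int32)[None, :]

for detail, value in zip(interpreter.get_input_details(), [tokens, positions]):
    interpreter.set_tensor(detail["index"], value)
interpreter.invoke()
context = interpreter.get_tensor(interpreter.get_output_details()[0]["index"])
print(context.shape)
```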
I uploaded the TFLite models I converted to HuggingFace.