Welcome to the Gemma Cookbook

This is a collection of guides and examples for Google Gemma. Gemma is a family of lightweight, state-of-the art open models built from the same research and technology used to create the Gemini models.

Get started with the Gemma models

Gemma is a family of lightweight, state-of-the art open models built from the same research and technology used to create the Gemini models. The Gemma model family includes:

base Gemma
- Gemma
- Gemma 2
Gemma variants

You can find the Gemma models on GitHub, Hugging Face models, Kaggle, Google Cloud Vertex AI Model Garden, and ai.nvidia.com.

Partner quickstart guides

Company	Description
Hugging Face	Utilize Hugging Face Transformers and TRL for fine-tuning and inference tasks with Gemma models.
NVIDIA	Fine-tune Gemma models with NVIDIA NeMo Framework and export to TensorRT-LLM for production.
LangChain	This tutorial shows you how to get started with Gemma and LangChain, running in Google Cloud or in your Colab environment.
MongoDB	This article presents how to leverage Gemma as the foundation model in a retrieval-augmented generation pipeline or system.

Workshops and technical talks

Notebook	Description
Workshop_How_to_Fine_tuning_Gemma.ipynb	Recommended finetuning notebook for getting started
Self_extend_Gemma.ipynb	Self-extend context window for Gemma in the I/O 2024 Keras talk
Gemma_control_vectors.ipynb	Implement control vectors with Gemma in the I/O 2024 Keras talk

Accompanying notebooks for the Build with AI video series

Folder
Business email assistant
Personal code assistant
Spoken language tasks

Cookbook table of contents

Gemma model overview
Common_use_cases.ipynb	Illustrate some common use cases for Gemma, CodeGemma and PaliGemma.

Gemma

Inference and serving
Keras_Gemma_2_Quickstart.ipynb	Gemma 2 pre-trained 9B model quickstart tutorial with Keras.
Keras_Gemma_2_Quickstart_Chat.ipynb	Gemma 2 instruction-tuned 9B model quickstart tutorial with Keras. Referenced in this blog.
Gemma inference with Flax/NNX	Gemma 1 inference with Flax/NNX framework (linking to Flax documentation)
Chat_and_distributed_pirate_tuning.ipynb	Chat with Gemma 7B and finetune it so that it generates responses in pirates' tone.
gemma_inference_on_tpu.ipynb	Basic inference of Gemma with JAX/Flax on TPU.
gemma_data_parallel_inference_in_jax_tpu.ipynb	Parallel inference of Gemma with JAX/Flax on TPU.
Gemma_Basics_with_HF.ipynb	Load, run, finetune and deploy Gemma using Hugging Face.
Gemma_with_Langfun_and_LlamaCpp.ipynb	Leverage Langfun to seamlessly integrate natural language with programming using Gemma 2 and LlamaCpp.
Gemma_with_Langfun_and_LlamaCpp_Python_Bindings.ipynb	Leverage Langfun for smooth language-program interaction with Gemma 2 and llama-cpp-python.
Guess_the_word.ipynb	Play a word guessing game with Gemma using Keras.
Game_Design_Brainstorming.ipynb	Use Gemma to brainstorm ideas during game design using Keras.
Translator_of_Old_Korean_Literature.ipynb	Use Gemma to translate old Korean literature using Keras.
Gemma2_on_Groq.ipynb	Leverage the free Gemma 2 9B IT model hosted on Groq (super fast speed).
Run_with_Ollama.ipynb	Run Gemma models using Ollama.
Run_with_Ollama_Python.ipynb	Run Gemma models using Ollama Python library.
Using_Gemma_with_Llamafile.ipynb	Run Gemma models using Llamafile.
Using_Gemma_with_LlamaCpp.ipynb	Run Gemma models using LlamaCpp.
Using_Gemma_with_LocalGemma.ipynb	Run Gemma models using Local Gemma.
Using_Gemini_and_Gemma_with_RouteLLM.ipynb	Route Gemma and Gemini models using RouteLLM.
Using_Gemma_with_SGLang.ipynb	Run Gemma models using SGLang.
Using_Gemma_with_Xinference.ipynb	Run Gemma models using Xinference.
Constrained_generation_with_Gemma.ipynb	Constrained generation with Gemma models using LlamaCpp and Guidance.
Integrate_with_Mesop.ipynb	Integrate Gemma with Google Mesop.
Integrate_with_OneTwo.ipynb	Integrate Gemma with Google OneTwo.
Deploy_with_vLLM.ipynb	Deploy a Gemma model using vLLM.
Deploy_Gemma_in_Vertex_AI.ipynb	Deploy a Gemma model using Vertex AI.
Prompting
Prompt_chaining.ipynb	Illustrate prompt chaining and iterative generation with Gemma.
LangChain_chaining.ipynb	Illustrate LangChain chaining with Gemma.
Advanced_Prompting_Techniques.ipynb	Illustrate advanced prompting techniques with Gemma.
RAG
RAG_with_ChromaDB.ipynb	Build a Retrieval Augmented Generation (RAG) system with Gemma using ChromaDB and Hugging Face.
Minimal_RAG.ipynb	Minimal example of building a RAG system with Gemma using Google UniSim and Hugging Face.
RAG_PDF_Search_in_multiple_documents_on_Colab.ipynb	RAG PDF Search in multiple documents using Gemma 2 2B on Google Colab.
Using_Gemma_with_LangChain.ipynb	Examples to demonstrate using Gemma with LangChain.
Using_Gemma_with_Elasticsearch_and_LangChain.ipynb	Example to demonstrate using Gemma with Elasticsearch, Ollama and LangChain.
Gemma_with_Firebase_Genkit_and_Ollama.ipynb	Example to demonstrate using Gemma with Firebase Genkit and Ollama
Gemma_RAG_LlamaIndex.ipynb	RAG example with LlamaIndex using Gemma.
Finetuning
Finetune_with_Axolotl.ipynb	Finetune Gemma using Axolotl.
Finetune_with_XTuner.ipynb	Finetune Gemma using XTuner.
Finetune_with_LLaMA_Factory.ipynb	Finetune Gemma using LLaMA-Factory.
Finetune_with_Torch_XLA.ipynb	Finetune Gemma using PyTorch/XLA.
Finetune_with_JORA.ipynb	Finetune Gemma using JORA.
Finetune_with_Unsloth.ipynb	Finetune Gemma using Unsloth.
Finetune_with_LitGPT.ipynb	Finetune Gemma using LitGPT.
Custom_Vocabulary.ipynb	Demonstrate how to use a custom vocabulary "<unused[0-98]>" tokens in Gemma.
Alignment
Aligning_DPO_Gemma_2b_it.ipynb	Demonstrate how to align a Gemma model using DPO (Direct Preference Optimization) with Hugging Face TRL.
Evaluation
Gemma_evaluation.ipynb	Demonstrate how to use Eleuther AI's LM evaluation harness to perform model evaluation on Gemma.
Mobile
Gemma on Android	Android app to deploy fine-tuned Gemma-2B-it model using MediaPipe LLM Inference API.

PaliGemma

Inference
Image_captioning_using_PaliGemma.ipynb	Use PaliGemma to generate image captions using Keras.
Image_captioning_using_finetuned_PaliGemma.ipynb	Compare the image captioning results using different PaliGemma versions with Hugging Face.
Finetune_PaliGemma_for_image_description.ipynb	Finetune PaliGemma for image description using JAX.
Integrate_PaliGemma_with_Mesop.ipynb	Integrate PaliGemma with Google Mesop.
Zero_shot_object_detection_in_images_using_PaliGemma.ipynb	Zero-shot Object Detection in images using PaliGemma.
Zero_shot_object_detection_in_videos_using_PaliGemma.ipynb	Zero-shot Object Detection in videos using PaliGemma.
Referring_expression_segmentation_in_images_using_PaliGemma.ipynb	Referring Expression Segmentation in images using PaliGemma.
Referring_expression_segmentation_in_videos_using_PaliGemma.ipynb	Referring Expression Segmentation in videos using PaliGemma.
Finetuning
Finetune_PaliGemma_with_Keras.ipynb	Finetune PaliGemma with Keras.
Finetune_PaliGemma_for_object_detection.ipynb	Fine-tune PaliGemma for object detection on a fashion dataset using JAX.
Mobile
PaliGemma on Android	Inference PaliGemma on Android using Hugging Face and Gradio Client API for tasks like zero-shot object detection, image captioning, and visual question-answering.

CodeGemma

Finetuning
CodeGemma_finetuned_on_SQL_with_HF.ipynb	Fine-Tuning CodeGemma on the SQL Spider Dataset.

Get help

Ask a Gemma cookbook-related question on the developer forum, or open an issue on GitHub.

Wish list

If you want to see additional cookbooks implemented for specific features/integrations, please send us a pull request by adding your feature request(s) in the wish list.

If you want to make contributions to the Gemma Cookbook project, you are welcome to pick any idea in the wish list and implement it.

Contributing

Contributions are always welcome. Please read contributing before implementation.

Thank you for developing with Gemma! We’re excited to see what you create.

jethac/gemma-cookbook