- 1. Overview
- 2. Solutions architecture overview
- 3. Deploy the infrastructure using Terraform
- 4. Deploying the Looker Extension
- 5. Using and Configuring the Extension
- 6. Developing the Looker Extension Environment
This repository compiles prescriptive code samples demonstrating how to create a Looker Extension integrating Looker with Vertex AI Large Language Models (LLMs).
Looker GenAI is an extension created to showcase the integration between Looker and LLMs, with two main features:
1.1. Generative Explore: use natural language to ask questions of your data. The LLM model will try to find the right fields, filters, sorts, pivots and limits to explore the data.
1.2. Generative Insights on Dashboards: the extension ingests all the data from a selected dashboard as context, and you can ask the LLM model questions grounded in that context.
There are two tabs on the extension:
Generative Explore: The user chooses a Looker Explore and asks questions using natural language. The application gathers the metadata from the Explore and builds a prompt for the LLM model, which returns an Explore with the appropriate fields, filters, sorts and pivots rendered in the extension. The user can select a visualization and add it to a dashboard.
The current default implementation uses the native integration between BigQuery and LLM models using BQML Remote Models [https://cloud.google.com/bigquery/docs/generate-text]
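As an illustration, the native integration is invoked through `ML.GENERATE_TEXT` over the deployed remote model. This is a minimal sketch: the model name `llm.llm_model` is the deployment default, while the prompt and generation parameters here are illustrative only.

```sql
-- Sketch of calling the BQML remote model created by the deployment.
-- `llm.llm_model` is the default model name; prompt and parameters are
-- examples, not the values the extension actually sends.
SELECT ml_generate_text_llm_result
FROM ML.GENERATE_TEXT(
  MODEL `llm.llm_model`,
  (SELECT 'List the fields needed to explore sales by brand' AS prompt),
  STRUCT(0.2 AS temperature, 1024 AS max_output_tokens, TRUE AS flatten_json_output));
```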
For production environments, it is recommended to use the deployment based on BigQuery Remote UDFs, which use a Google Cloud Function to call the Vertex AI APIs. This option is more flexible: it is easy to change the model and its parameters, and to point to a fine-tuned endpoint (once gemini-pro supports fine-tuning). [https://cloud.google.com/bigquery/docs/remote-functions]
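A remote UDF of that shape can be sketched as follows. The function name, connection name and Cloud Function URL below are placeholders; the actual Terraform deployment provisions its own.

```sql
-- Sketch of a BigQuery remote function backed by a Cloud Function that
-- calls the Vertex AI API. Function name, connection and endpoint URL
-- are placeholders, not the deployed names.
CREATE OR REPLACE FUNCTION `llm.bq_vertex_remote`(prompt STRING) RETURNS STRING
REMOTE WITH CONNECTION `us.llm-connection`
OPTIONS (endpoint = 'https://REGION-PROJECT_ID.cloudfunctions.net/vertex-llm-endpoint');
```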
Optionally, users can train their own custom fine-tuned model, providing more examples to make it more accurate than the default model. For this path, the repo includes a Terraform deployment example that uses Cloud Workflows to orchestrate the creation of the fine-tuned model. After fine-tuning the model, change the endpoint in the Cloud Function (Remote UDF option).
Generative Insights on Dashboards: The user chooses a Looker dashboard and asks questions using natural language. In this scenario, the extension builds a prompt and sends the data from all tiles to the LLM model as context, together with the user's question.
The architecture for the extension needs the following infrastructure in a GCP Project:
- BigQuery Dataset (default name: llm)
- BigQuery Remote Model pointing to gemini-pro API (llm_model)
- IAM Service Accounts to create a connection to Looker
- IAM permission for BQ connection to connect to Vertex AI
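For reference, the BigQuery remote model in this list is created with DDL along these lines (a sketch: the connection name is an assumption, and the Terraform deployment provisions the real resources):

```sql
-- Sketch of the remote model pointing at the gemini-pro endpoint.
-- The connection name `us.llm-connection` is an assumed placeholder.
CREATE OR REPLACE MODEL `llm.llm_model`
REMOTE WITH CONNECTION `us.llm-connection`
OPTIONS (ENDPOINT = 'gemini-pro');
```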
These instructions will guide you through installing the resources required by extension-gen-ai using your Google Cloud Shell.
First, clone the repository to Cloud Shell, or run directly in your Cloud Shell session:

```sh
cloudshell_open --repo_url "https://github.com/looker-open-source/extension-gen-ai" --page "shell" --open_workspace "deployment/terraform" --force_new_clone
```
Set the `gcloud` command to use the desired project ID:

```sh
gcloud config set project PROJECT-ID
```
Run the script to create the Terraform state buckets:

```sh
sh scripts/create-state-bucket.sh
```
Initialize the Terraform modules:

```sh
terraform init
```
Deploy the Terraform resources:

```sh
terraform apply -var="project_id=YOUR_PROJECT_ID"
```
While Terraform is executing, follow the instructions in 4. Deploying the Looker Extension or 6. Developing the Looker Extension Environment.
The extension will be available directly through the Marketplace, or through the manual deployment described below:
- Log in to Looker and create a new project named `looker-genai`. Depending on the version of Looker, a new project can be created under:
  - Develop => Manage LookML Projects => New LookML Project, or
  - Develop => Projects => New LookML Project

  Select "Blank Project" as the "Starting Point". This creates a new LookML project with no files.
- In this GitHub repository, there is a folder named `looker-project-structure` containing 3 files:
  - `manifest.lkml`
  - `looker-genai.model`
  - `bundle.js`

  Drag and drop all 3 files into the project folder.
- Edit `looker-genai.model` to include the Looker connection to BigQuery. In this step you can either create a new connection using the service account generated by Terraform, or reuse an existing Looker connection. If you use an existing connection, make sure its service account has the IAM permissions needed to query the newly created connection and model.
- Connect the new project to Git. Create a new repository on GitHub or a similar service and follow the instructions to connect your project to Git, or set up a bare repository.
- Commit the changes and deploy them to production through the Project UI.
- Make sure that the project has permission to use this connection: Develop => Projects => Configure => select ONLY the connection that will be used to connect to BigQuery for the Extension LLM application.
- Manually go to the GCP project and make sure that the service account associated with the connection has permission to use the newly created connection and the llm dataset.
- Test the extension. Open the browser's web developer console to see errors or debug, and verify in your GCP project that queries are reaching BigQuery and executing properly.
- If you have any doubts or questions, feel free to e-mail looker-genai-extension@google.com. There is also a debug table in BigQuery called explore_logs, which you can export to CSV and send to us.
```sql
INSERT INTO `llm.explore_prompts`
VALUES("Top 3 brands in sales", "What are the top 3 brands that had the most sales price in the last 4 months?", "thelook.order_items", "explore")
```
The values to insert are, in order: name of the example, prompt, model.explore, and type (explore or dashboard).
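For instance, a dashboard-type prompt follows the same pattern (the values below are illustrative):

```sql
-- Illustrative dashboard-type example prompt; values are made up.
INSERT INTO `llm.explore_prompts`
VALUES("Sales dashboard summary", "Summarize the key sales trends shown on this dashboard", "thelook.order_items", "dashboard")
```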
All user-level settings are stored in your BigQuery project under the llm dataset. You can manage these settings in the "Developer Settings" tab. Adjustable configurations include:
- Console Log Level: Controls the verbosity of logs sent to the console.
- Use Native BQML or Remote UDF: Determines whether to use native BigQuery ML functions or custom remote User-Defined Functions (UDFs). Remote UDFs are generally recommended for production workloads.
- Custom Prompt to be used for your userId.
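To inspect what is currently stored before changing anything, you can query the settings table directly (a sketch; the exact shape of the `config` column may vary by version):

```sql
-- List per-user settings; the row with a NULL userId holds the defaults.
SELECT userId, config
FROM `llm.settings`
ORDER BY userId
```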
Modifying Settings with SQL in BigQuery
The SQL below changes the settings for all users:

```sql
UPDATE `llm.settings` SET config = (SELECT config from `llm.settings` WHERE userId = "YOUR_USER_ID") WHERE True
```
The default settings are the row where userId is NULL; you can change just the default settings if you want:

```sql
UPDATE `llm.settings` SET config = (SELECT config from `llm.settings` WHERE userId = "YOUR_USER_ID") WHERE userId IS NULL
```
You can follow all the steps from 4. Deploying the Looker Extension.
In `manifest.lkml`, comment out the production `file` parameter and set the `url` to localhost:
```lookml
project_name: "looker-genai"
application: looker-genai {
  label: "Looker GenAI Extension"
  url: "https://localhost:8080/bundle.js"
  # Comment production file: "bundle.js"
  entitlements: {
    use_embeds: yes
    use_form_submit: yes
    use_iframes: yes
    external_api_urls: ["https://localhost:8080","http://localhost:8080"]
    core_api_methods: ["run_inline_query", "me", "all_looks", "run_look", "all_lookml_models", "run_sql_query", "create_sql_query",
      "lookml_model_explore", "create_query", "use_iframes", "use_embeds", "use_form_submit",
      "all_dashboards", "dashboard_dashboard_elements", "run_query", "dashboard", "lookml_model"] # Add more entitlements here as you develop new functionality
  }
}
```
6.1. Install the dependencies with Yarn and start the development server:

```sh
yarn install
yarn develop
```
The development server is now running and serving the JavaScript at https://localhost:8080/bundle.js.
Execute `yarn build` to generate `dist/bundle.js`, and commit it to the LookML project. Make sure the manifest points back to the production file: `file: "bundle.js"`.

```sh
yarn build
```
Vertex and LLM Backends: to create the fine-tuned model, a sample Terraform script is provided in the repo.
The architecture needs the following infrastructure:
- VertexAI Fine Tuned LLM Model with the Looker App Examples
- Cloud Function that will call the Vertex AI Tuned Model Endpoint
- BigQuery Datasets, Connections and Remote UDF that will call the Cloud Function
TODO: The code has to be refactored to allow for the custom fine-tuned model using BigQuery, Remote UDFs and Cloud Functions.
Inside the `gcloud` environment, invoke the Cloud Workflow:

```sh
gcloud workflows execute fine_tuning_model
```
TODO: Refactor the SQL endpoints to use the new SQL syntax with UDFs and BigQuery (see earlier commits in the repo).