This repository includes code samples that use the DBSQL REST API.
To set up the Spreadsheet application, follow these steps:
- Create a new Google Spreadsheet file called `Orders`.
- Inside, create a sheet called `Orders`.
- Navigate to `Extensions -> Apps Script` and rename the script to `Orders`.
- In the Apps Script editor, create four files with the names and content of the files located in the `spreadsheet` folder: `Interface.gs`, `DBSQL.gs`, `sidebar.html`, and `info.html`.
- Inside the `DBSQL.gs` file, provide the connectivity parameters for the Databricks server: `HOST`, `WAREHOUSE_ID`, and `AUTH_TOKEN`.
- Save the files.
- Go back to your Spreadsheet and refresh it.
- You will now see a `Managed Orders` menu next to the `Extensions` menu.
- Select `Managed Orders -> Show Monthly Orders` and accept the authorization request.
- To configure the permissions in your sheet, select `Project Settings -> Show "appsscript.json" manifest file in editor`. This will show a new file in the Apps Script workspace where you can edit the grants; a sample manifest follows this list.
- You are now all set. Run the predefined queries using the entries in the `Managed Orders` menu.
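The following is a minimal sketch of what the `appsscript.json` grants can look like, not the exact manifest for this project. The scopes shown assume the script reads the active spreadsheet, renders a sidebar, and calls the Databricks REST API via `UrlFetchApp`; adjust them to match what the sample code actually does:

```json
{
  "timeZone": "Etc/UTC",
  "runtimeVersion": "V8",
  "oauthScopes": [
    "https://www.googleapis.com/auth/spreadsheets.currentonly",
    "https://www.googleapis.com/auth/script.container.ui",
    "https://www.googleapis.com/auth/script.external_request"
  ]
}
```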
The Postman collection consists of two JSON files, `Databricks Environment.postman_environment.json` and `Databricks SQL Execution API.postman_collection.json`. The former can be used to define the environment variables needed to establish a connection with the Databricks server: `HOST`, `WAREHOUSE_ID`, and `AUTH_TOKEN`. The latter includes several API requests using the sync and async flows.
To use the Postman collection, follow the steps below:
- Open Postman, go to `My Workspace`, and click `Import` to add the two files.
- Go to `Environments` and select `Databricks Environment` to update the connection parameters. Make sure to save the changes.
- Select `Databricks Environment` from the environment selector.
- Run one of the API requests for executing statements.
- Use the GET requests to poll for status and fetch chunks.
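For reference, the sync and async flows exercised by the collection can be sketched in Python with the `requests` package. This is an illustration of the underlying Statement Execution API calls, not part of the collection itself; the host, warehouse, and token values are placeholders for the same `HOST`, `WAREHOUSE_ID`, and `AUTH_TOKEN` environment variables:

```python
import time
import requests

HOST = "xxx.cloud.databricks.com"        # placeholder values; use your own
WAREHOUSE_ID = "your-warehouse-id"
AUTH_TOKEN = "your-personal-access-token"

BASE = f"https://{HOST}/api/2.0/sql/statements"
HEADERS = {"Authorization": f"Bearer {AUTH_TOKEN}"}

# Sync flow: wait up to 30s and receive the result inline in the response.
sync = requests.post(BASE, headers=HEADERS, json={
    "warehouse_id": WAREHOUSE_ID,
    "statement": "SELECT 1",
    "wait_timeout": "30s",
}).json()
print(sync["status"]["state"])

# Async flow: return immediately, then poll the statement by its id.
resp = requests.post(BASE, headers=HEADERS, json={
    "warehouse_id": WAREHOUSE_ID,
    "statement": "SELECT 1",
    "wait_timeout": "0s",
}).json()
statement_id = resp["statement_id"]
while resp["status"]["state"] in ("PENDING", "RUNNING"):
    time.sleep(2)
    resp = requests.get(f"{BASE}/{statement_id}", headers=HEADERS).json()
print(resp["status"]["state"])
```

Fetching result chunks with GET follows the same pattern and is shown in the Python sketch in the next section.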
The Python script executes a statement in asynchronous mode with `disposition` and `format` set to `EXTERNAL_LINKS` and `ARROW_STREAM`, respectively. Then, it retrieves each chunk using the presigned URL, deserializes the Arrow bytes, and converts the result into a pandas `DataFrame`.
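A condensed sketch of that flow, assuming `requests` and `pyarrow` are installed (this illustrates the API usage rather than reproducing the script itself; the query and credentials below are placeholders):

```python
import time
import requests
import pyarrow as pa

HOST = "xxx.cloud.databricks.com"        # placeholder values; use your own
WAREHOUSE_ID = "your-warehouse-id"
AUTH_TOKEN = "your-personal-access-token"

BASE = f"https://{HOST}/api/2.0/sql/statements"
HEADERS = {"Authorization": f"Bearer {AUTH_TOKEN}"}

# Submit asynchronously, requesting Arrow IPC chunks behind presigned URLs.
resp = requests.post(BASE, headers=HEADERS, json={
    "warehouse_id": WAREHOUSE_ID,
    "statement": "SELECT * FROM some_catalog.some_schema.orders",  # illustrative query
    "wait_timeout": "0s",
    "disposition": "EXTERNAL_LINKS",
    "format": "ARROW_STREAM",
}).json()

statement_id = resp["statement_id"]
while resp["status"]["state"] in ("PENDING", "RUNNING"):
    time.sleep(2)
    resp = requests.get(f"{BASE}/{statement_id}", headers=HEADERS).json()
assert resp["status"]["state"] == "SUCCEEDED", resp["status"]

# Fetch each chunk; every external link is a presigned URL to Arrow stream bytes.
tables = []
for i in range(resp["manifest"]["total_chunk_count"]):
    chunk = requests.get(f"{BASE}/{statement_id}/result/chunks/{i}",
                         headers=HEADERS).json()
    for link in chunk["external_links"]:
        # Presigned URLs are fetched without the Authorization header.
        raw = requests.get(link["external_link"]).content
        tables.append(pa.ipc.open_stream(raw).read_all())

df = pa.concat_tables(tables).to_pandas()
print(df.shape)
```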