ABSQL is a format for templated SQL, and the ABSQL package is a rendering engine for this format.
ABSQL allows users to inject limitless context - text, objects, and functions - into their SQL templates, both within the rendering engine (the Runner) and in the SQL file itself.
The flexibility and extendibility of ABSQL affords SQL engineers a consistent interface for generating templated SQL, while simultaneously giving downstream orchestration solutions (e.g. Airflow) the opportunity to develop orchestrator-specific implementations for ABSQL.
ABSQL leverages the familiar Jinja2 templating engine, as well as YAML-style frontmatter to provide users with a unique and intuitive SQL authoring experience.
The ABSQL Runner is robust, but we will start simple. The static render_text
method will render user-provided text with any provided context.
from absql import Runner
Runner.render_text('{{greeting}}, World!', greeting="Hello")
# 'Hello, World!'
We might want to reuse the context we provide. In this case, we can instantiate a runner and provide it with context. We can then render any text with this saved context. An instantiated runner has a render
method, that works with text and - as we will see later - file paths, as well.
from absql import Runner
r = Runner(greeting = "Hello")
r.render("{{greeting}}, World!")
# 'Hello, World!'
The runner is not bound to plain text, and the context provided by the user is not confined to just the runner. Context can also be passed in through a frontmatter block at the top of a .sql
file. This frontmatter block is parsed as YAML.
In the example file below, my_file.sql
, the table_name
context is provided in the frontmatter block, which starts and ends with three hyphens (---
). The body of the file contains the templated SQL we want to render.
---
table_name: my_table
---
SELECT *
FROM {{table_name}}
WHERE greeting = '{{greeting}}'
Now, using the same runner as above, we can render the SQL, with our table_name
context being provided from the file itself, and the greeting
context being fed in from the runner.
r.render("my_file.sql")
# SELECT *
# FROM my_table
# WHERE greeting = 'Hello'
There are cases when templated queries require data from a database. In ABSQL, this can be accomplished by providing a SQLAlchemy engine to a runner, and using some of the default ABSQL-provided functions. While we could instantiate a new runner with this engine, we will use our existing runner's set_context
method to add a SQLite engine. It is important that we set the engine as engine
in the context.
from sqlalchemy import create_engine
engine = create_engine("sqlite:///:memory:")
r.set_context(engine=engine)
Now, we can update our query in my_file.sql
to use ABSQL's builtin table_exists
function. We will arbitrarily add a limit to our query, based on if a table exists.
---
table_name: my_table
---
SELECT *
FROM {{table_name}}
WHERE greeting = '{{greeting}}'
{% if table_exists("a_nonexistent_table") %}
LIMIT 10
{% else %}
LIMIT 1
{% endif %}
Then, we can render again.
r.render("my_file.sql")
# SELECT *
# FROM my_table
# WHERE greeting = 'Hello'
# LIMIT 1
See below for a full list of builtin functions.
The following functions are available in any instantiated runner:
Category | Function | Description | Example |
---|---|---|---|
database | get_max_value |
Get the maximum value of a column. | get_max_value("my_schema.my_table.my_column") |
database | get_min_value |
Get the minimum value of a column. | get_min_value("my_schema.my_table.my_column") |
database | table_exists |
TRUE/FALSE if the table exists in the database. | table_exists("my_schema.my_table") |
database | query_db |
Send any query to the database and get back the results. | query_db("SELECT name FROM my_schema.my_table") |
env | env_switch |
Switch a value based on the given environment (an environment variable of "ENV" by default). | env_switch(dev=10, prod=500, default=30) |
env | env_var |
Retrieve a value from an environment variable. | env_var("MY_ENVIRONMENT_VARIABLE") |
time | datetime |
Python's datetime.datetime function. |
datetime(2022, 1, 1) |
time | previous_date |
Get a date string for the previous date (based on UTC by default). | previous_date() |
time | previous_hour |
Get a datetime string the previous hour (based on UTC by default). | previous_hour() |
time | timedelta |
Python's datetime.timedelta function. |
timedelta(hours=1) |
For the database functions, it is recommended that you pass in an engine to the runner ahead of rendering (e.g. runner = Runner(engine=engine)
). If no engine is provided, the runner will search for a URI in the environment variable AB__URI
.
Also note that these functions are accessible in both the frontmatter and body of a file. Below is an example illustrating this.
---
schema: "{{ env_switch(prod='prod_schema', dev=env_var('DEV_SCHEMA')) }}"
---
SELECT *
FROM {{schema}}.my_table
As mentioned elsewhere, users can add functions and context:
- In a file's frontmatter
- When instantiating a new runner
- By using the
set_context
method on an existing runner
Note that functions which contain a keyword argument engine
will automatically be passed the engine object attached to the runner by default. Users can set desired keyword arguments to automatically be added to functions by passing a list of keyword argument names to the partial_kwargs
argument, available in any of the render_[x]
functions, as well as when instantiating a Runner object.
While no configuration is required to get started with ABSQL, there are two environment variables you can set:
AB__URI
- The URI for the database that should be used by ABSQL's default database functions. As mentioned, it is advisable that you just pass in your own SQLAlchemy engine object to a runner under the context nameengine
, if you are going to use the database functions.AB__ENV
- The name of the environment variable that designates what the environment is. By default, ABSQL assumes this environment variable is called "ENV".
The below assumes you have Docker installed.
First, git clone
this repository.
Then:
export ABSQL_HOME="~/path/to/this/project"
cd $ABSQL_HOME
make build-image
make run-image
The above will build the development image under the name absql-testing
and run a container called absql-testing
.
From here, you can:
make exec
- Exec into a terminal in the running container.make test
- Runspytest
in a temporary container.make coverage
- Runspytest
and generates a coverage report.make browse-coverage
- Opens up the coverage report in your browser.make stop-container
- Stop the running container.make start-container
- Start a stopped container.
make stop-container # if you have a running container
make remove-container # if you have a stopped container
make build-image
make run-image