Trino (f.k.a PrestoSQL) dialect for SQLAlchemy.
The primary purpose of this is provide a dialect for Trino that can be used with Apache Superset. But other use-cases should works as well.
Trino version 352 and higher
The driver can either be installed through PyPi or from the source code.
pip install sqlalchemy-trino
pip install git+https://github.com/dungdm93/sqlalchemy-trino
To connect from SQLAlchemy to Trino, use connection string (URL) following this pattern:
trino://<username>:<password>@<host>:<port>/catalog/[schema]
You can pass the JWT token via either connect_args
or the query string
parameter accessToken
:
from sqlalchemy.engine import create_engine
from trino.auth import JWTAuthentication
# pass access token via connect_args
engine = create_engine(
'trino://<username>@<host>:<port>/',
connect_args={'auth': JWTAuthentication('a-jwt-token')},
)
# pass access token via the query string param accessToken
engine = create_engine(
'trino://<username>@<host>:<port>/?accessToken=a-jwt-token',
)
Notice: When using username and password, it will connect to Trino over TLS connection automatically.
It supports user impersonation with username and password based authentication only.
You can pass the session user (a.k.a., the user that will be impersonated) via
either connect_args
or the query string parameter sessionUser
:
from sqlalchemy.engine import create_engine
# pass session user via connect_args
engine = create_engine(
'trino://<username>:<password>@<host>:<port>/',
connect_args={'user': 'user-to-be-impersonated'},
)
# pass session user via a query string parameter
engine = create_engine(
'trino://<username>:<password>@<host>:<port>/?sessionUser=user-to-be-impersonated',
)
import pandas as pd
from pandas import DataFrame
import sqlalchemy_trino
from sqlalchemy.engine import Engine, Connection
def trino_pandas_write(engine: Engine):
df: DataFrame = pd.read_csv("tests/data/population.csv")
df.to_sql(con=engine, schema="default", name="abcxyz", method="multi", index=False)
print(df)
def trino_pandas_read(engine: Engine):
connection: Connection = engine.connect()
df = pd.read_sql("SELECT * FROM public.foobar", connection)
print(df)
Note: in df.to_sql
following params is required:
index=False
because index is not supported in Trino.method="multi"
: currentlymethod=None
(default) is not working because Trino dbapi is not supportexecutemany
yet