/databricks-sql-python

Databricks SQL Connector for Python

Primary LanguagePythonApache License 2.0Apache-2.0

Databricks SQL Connector for Python

PyPI Downloads

The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the Python DB API 2.0 specification.

This connector uses Arrow as the data-exchange format, and supports APIs to directly fetch Arrow tables. Arrow tables are wrapped in the ArrowQueue class to provide a natural API to get several rows at a time.

You are welcome to file an issue here for general use cases. You can also contact Databricks Support here.

Requirements

A development machine running Python >=3.7, <3.10.

Documentation

For the latest documentation, see

Quickstart

Install the library with pip install databricks-sql-connector

Example usage:

from databricks import sql

connection = sql.connect(
  server_hostname='********.databricks.com',
  http_path='/sql/1.0/endpoints/****************',
  access_token='dapi********************************')


cursor = connection.cursor()

cursor.execute('SELECT * FROM RANGE(10)')
result = cursor.fetchall()
for row in result:
  print(row)

cursor.close()
connection.close()

In the above example:

  • server-hostname is the Databricks instance host name.
  • http-path is the HTTP Path either to a Databricks SQL endpoint (e.g. /sql/1.0/endpoints/1234567890abcdef), or to a Databricks Runtime interactive cluster (e.g. /sql/protocolv1/o/1234567890123456/1234-123456-slid123)
  • personal-access-token is the Databricks Personal Access Token for the account that will execute commands and queries

Contributing

See CONTRIBUTING.md

License

Apache License 2.0