The SeqsLab Connector for Python, based on PyHive, lets you create a Python DB-API connection to Atgenomix SeqsLab interactive jobs (clusters) and develop Python-based workflow applications. It is a Hive Thrift-based client with no dependency on ODBC or JDBC. It also provides a SQLAlchemy dialect and an Apache Superset database engine spec, so that tools built on either can run DQL (Data Query Language) statements against a job.
You are welcome to file an issue for general use cases. You can also contact Atgenomix Support here.
Python 3.7 or above is required.
Install using pip.
pip install seqslab-connector
For Apache Superset integration, install with the superset extra:
pip install seqslab-connector[superset]
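from seqslab import hive
# Connect to an interactive job; replace each placeholder value with your own.
conn = hive.connect(database='run_name', http_path='job_run_id', username='user', password='pass', host='job_cluster_host')
cursor = conn.cursor()
# List the tables available in the run.
cursor.execute('SHOW TABLES')
print(cursor.fetchall())
# Preview the first rows of a workflow table.
cursor.execute('SELECT * FROM my_workflow_table_name LIMIT 10')
print(cursor.fetchall())
# Release the cursor and the connection when done.
cursor.close()
conn.close()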
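Since fetchall() returns plain row tuples, it can help to pair each value with its column name. The following is a minimal sketch, assuming the connector's cursor exposes the standard DB-API (PEP 249) description attribute, as PyHive cursors do. The helper name rows_as_dicts is hypothetical and not part of the connector.

```python
def rows_as_dicts(cursor):
    """Return the remaining rows of an executed DB-API cursor as a list of dicts.

    Hypothetical helper, not part of seqslab-connector. It relies only on the
    PEP 249 cursor interface (description and fetchall).
    """
    # PEP 249: description is a sequence of 7-item tuples; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Usage with a cursor opened as in the example above (requires a running job):
#   cursor.execute('SELECT * FROM my_workflow_table_name LIMIT 10')
#   for record in rows_as_dicts(cursor):
#       print(record)
```

Because the helper touches nothing beyond the DB-API surface, it can be tried against any PEP 249 driver before pointing it at a SeqsLab job.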
from sqlalchemy.engine import create_engine
engine = create_engine('seqslab+hive://user:pass@job_cluster_host/run_name?http_path=job_run_id')
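SQLAlchemy connection URLs need special characters in the username or password (such as @ or /) to be percent-encoded. A minimal sketch of building the URL shown above from its parts, using only the standard library; the helper name build_seqslab_url and the sample password are illustrative assumptions, not part of the connector.

```python
from urllib.parse import quote_plus, urlencode


def build_seqslab_url(username, password, host, run_name, job_run_id):
    """Assemble a seqslab+hive SQLAlchemy URL (hypothetical helper).

    The URL layout mirrors the create_engine example above:
    seqslab+hive://user:pass@host/run_name?http_path=job_run_id
    """
    # Percent-encode credentials so characters like '@' or '/' do not break URL parsing.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    # urlencode also escapes the http_path value if it contains reserved characters.
    query = urlencode({"http_path": job_run_id})
    return f"seqslab+hive://{credentials}@{host}/{run_name}?{query}"


url = build_seqslab_url("user", "p@ss/word", "job_cluster_host", "run_name", "job_run_id")
print(url)
# seqslab+hive://user:p%40ss%2Fword@job_cluster_host/run_name?http_path=job_run_id
```

The resulting string can be passed to create_engine in place of the literal URL.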
For the latest documentation, see SeqsLab.