
Unofficial Python client for Dremio

Primary language: Python. License: Apache License 2.0.

Dremio client


The unofficial Python client for Dremio's REST API. It enables both administrators and data scientists to get the most out of Dremio in Python.

Features

  • Cross platform support

  • Supports Python 2.7 through 3.7

  • Full support for Dremio's REST API

  • Optional support for Dremio's ODBC or experimental Arrow Flight capabilities

  • Rich config file support via the confuse YAML configuration library. With the configuration stored in a YAML file, creating a client takes only a few lines:

    from dremio_client import init
    client = init() # initialise connectivity to Dremio via config file
    catalog = client.data # fetch catalog
    vds = catalog.space.vds.get() # fetch a specific dataset
    df = vds.query() # query the first 1000 rows of the dataset and return as a DataFrame
    pds = catalog.source.pds.get() # fetch a physical dataset
    pds.metadata_refresh() # refresh metadata on that dataset
  • Command-line interface for integration with scripts

    $ dremio_client query --sql 'select * from sys.options'
    {'results':results}
    $ dremio_client refresh-metadata --table 'my.vds.name'
    {'result':'ok'}
  • Catalog autocompletion in Jupyter Notebooks

https://raw.github.com/rymurr/dremio_client/master/docs/images/autocomplete.gif
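For readers curious what a REST client like this wraps, the following is a minimal sketch of raw calls against Dremio's REST API using only the Python standard library. It is not how dremio_client is implemented internally. The helper names (`login_request`, `sql_request`, `catalog_request`, `send`) are hypothetical, and the base URL and credentials in the trailing comments are placeholder assumptions for a local test instance.

```python
import json
import urllib.request


def auth_headers(token):
    # Dremio expects the login token prefixed with "_dremio" in the Authorization header.
    return {"Authorization": "_dremio" + token, "Content-Type": "application/json"}


def login_request(base_url, username, password):
    # Build (but do not send) the login request; the response JSON carries a "token" field.
    body = json.dumps({"userName": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/apiv2/login",
        data=body,
        headers={"Content-Type": "application/json"},
    )


def sql_request(base_url, token, sql):
    # Build a request that submits a SQL job; the response JSON carries the job "id".
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/v3/sql", data=body, headers=auth_headers(token)
    )


def catalog_request(base_url, token):
    # Build a request that lists the top-level catalog entries.
    return urllib.request.Request(
        base_url + "/api/v3/catalog", headers=auth_headers(token)
    )


def send(request):
    # Send a prepared request and decode the JSON response body.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage against a running Dremio instance (placeholder host and credentials):
#   base = "http://localhost:9047"
#   token = send(login_request(base, "my_user", "my_password"))["token"]
#   catalog = send(catalog_request(base, token))
#   job = send(sql_request(base, token, "select * from sys.options"))
```

Separating request construction from sending keeps the sketch easy to inspect without a live server. The client library adds config handling, catalog navigation and DataFrame conversion on top of calls like these.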

Status

This is still alpha software and is relatively incomplete. Contributions in the form of GitHub issues or pull requests are welcome. See CONTRIBUTING.

TODO

  • put, delete and post
  • add reflections into catalog
  • test for larger spaces and ensure it doesn't fetch everything if we hit directly
  • search
  • docs
  • logging
  • handle exceptions correctly in cli
  • osx & appveyor
  • testing