
A declarative, SQL-like DSL for data integration tasks.

Primary language: Go. License: GNU General Public License v3.0 (GPL-3.0).

Analyst


Purpose

Analyst is a tool to validate and run Analyst Query Language (AQL) scripts. AQL is an ETL configuration language for developers that aims to be:

  • Declarative: the developer defines the components, how they depend on one another, and any additional synchronization (i.e. AFTER); the runtime figures out the DAG and executes it
  • Intuitive: similar syntax to SQL, but any options for external programs such as MS Excel use native conventions such as Excel Ranges
  • Maintainable: support large jobs and code reuse through language features like INCLUDE and EXTERN
  • Extensible: use a stdin/stdout protocol and pipes to write ETL logic in any language; Python and JavaScript are supported natively
  • Stateful: components can persist state in a SQLite3 database unique to each job run (GLOBAL source/destination)

It has connectors to:

  • MS SQL Server (source/destination)
  • Postgres (source/destination)
  • SQLite3 (source/destination)
  • Mandrill transactional email API (destination)
  • Web APIs (source)
  • Slack (for logging only)
  • Flat file (source)
  • Console (destination)
  • Built-in in-memory SQLite3 database (source/destination)
  • JSON-RPC plugins (source/destination)

Getting Started

  1. Grab the latest binary from the releases tab and place it on your PATH.
  2. Create and save an AQL script.
  3. Run analyst run --script <path-to-your-script>.

For a "hello world" example, try:

DATA 'MyMessage' (
	[
	  ["Hello, World"]
	]
) INTO CONSOLE WITH (COLUMNS = 'Message')

Documentation

Docs are hosted on GitHub Pages.

Table of Contents

  1. Get Started
  2. Recipes
  3. Blocks

Contributing

All contributions are welcome:

  • To report a bug or suggest a feature, please open an issue
  • If you wish to fix a minor bug or issue, please open a PR directly
  • For enhancements, refactoring, or major issues, please open an issue before opening a PR

License

All source code and artifacts are released under GNU General Public License v3.0, as detailed in LICENSE.md. This means, among other things, that if you wish to use any part of the code or artifacts in your own project, it must be distributed under the same license.

If this is not suitable for your use case, please get in touch by opening an issue or on Twitter @MikeBrno.