dashbase-idl

Dashbase IDL definitions

Avro

For storing and transmitting structured data in a compact and language-independent way, we've created Avro schemas here.

Currently only Avro is supported, because it is natively supported by both the Hadoop and Kafka projects.

DashbaseEvent Definition

A DashbaseEvent defines a record to be inserted into Dashbase, and is composed of three parts:

Time

Creation time of the event, as a timestamp in milliseconds. Defaults to 0 if not specified.
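
As a minimal sketch of these semantics, the hypothetical helper below converts an ISO-8601 timestamp to epoch milliseconds and falls back to 0 when no timestamp is available. The names EventTimes, toEpochMillis and UNSPECIFIED_TIME_MILLIS are illustrative assumptions and are not part of dashbase-idl.

```java
import java.time.Instant;

// Hypothetical helper for illustration only; not part of dashbase-idl.
final class EventTimes {
    // Value used when an event carries no timestamp, matching the default above.
    static final long UNSPECIFIED_TIME_MILLIS = 0L;

    private EventTimes() {}

    // Converts an ISO-8601 instant such as "1970-01-01T00:00:01.500Z"
    // to epoch milliseconds; returns the default for a null input.
    static long toEpochMillis(String isoInstant) {
        if (isoInstant == null) {
            return UNSPECIFIED_TIME_MILLIS;
        }
        return Instant.parse(isoInstant).toEpochMilli();
    }
}
```

A value produced this way could then be passed to the builder's withTimeInMillis method.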

Columns

Dashbase columns define how the record is to be stored.

  • meta columns - structured data that is not tokenized and supports aggregations, e.g. topn. Examples: host, response code.
  • number columns - numeric data that is indexed as numbers and supports numeric aggregations, e.g. sum/min/max/avg. Examples: latency, byte count.
  • text columns - unstructured text that is tokenized and supports full-text queries. Examples: log messages, agents.
  • id columns - optimized for optional id information. Like meta columns, they are not tokenized, but aggregations are not supported.
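
The four column types can be summarized as a small data model. The sketch below is hypothetical: the ColumnType and Capability enums are illustrative names, not types from dashbase-idl, and only the capabilities stated in the list above are recorded. That number columns are not tokenized is an assumption, since the text only says they are indexed as numbers.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical capability flags, limited to what the list above states.
enum Capability { AGGREGATION, NUMERIC_AGGREGATION, FULL_TEXT_QUERY }

// Hypothetical model of the column types; not part of dashbase-idl.
enum ColumnType {
    META(false, EnumSet.of(Capability.AGGREGATION)),           // e.g. host, response code
    NUMBER(false, EnumSet.of(Capability.NUMERIC_AGGREGATION)), // assumed not tokenized
    TEXT(true, EnumSet.of(Capability.FULL_TEXT_QUERY)),        // e.g. log messages
    ID(false, EnumSet.noneOf(Capability.class));               // no aggregations

    final boolean tokenized;
    private final Set<Capability> capabilities;

    ColumnType(boolean tokenized, Set<Capability> capabilities) {
        this.tokenized = tokenized;
        this.capabilities = capabilities;
    }

    // True if the column type is described above as offering the capability.
    boolean supports(Capability capability) {
        return capabilities.contains(capability);
    }
}
```

For example, ColumnType.ID.supports(Capability.AGGREGATION) is false, which reflects why a field that needs topn queries belongs in a meta column and not in an id column.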

Payload

Storage of the raw data can be configured via:

  • omitPayload - if true, raw data storage is skipped. Examples would be metrics or click data, where raw event bytes are typically not used, and storing them would be wasteful.
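
The effect of the flag can be pictured with the following sketch. It is a hypothetical illustration of the rule, not the Dashbase storage code; PayloadPolicy and payloadToStore are assumed names.

```java
import java.util.Optional;

// Hypothetical illustration of the omitPayload rule; not part of dashbase-idl.
final class PayloadPolicy {
    private PayloadPolicy() {}

    // Returns the raw bytes that would be kept for an event, or empty
    // when omitPayload is true or when there are no raw bytes at all.
    static Optional<byte[]> payloadToStore(byte[] raw, boolean omitPayload) {
        if (omitPayload || raw == null) {
            return Optional.empty();
        }
        // Defensive copy so later changes to the caller's array do not leak in.
        return Optional.of(raw.clone());
    }
}
```

For a metrics event, this corresponds to calling withOmitPayload(true) on the builder, so that only the columns are kept.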

Java API

DashbaseEventBuilder should be used to build a DashbaseEvent instance.

Example:

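DashbaseEventBuilder eventBuilder = new DashbaseEventBuilder()
        .withOmitPayload(false)                        // keep the raw payload
        .withTimeInMillis(System.currentTimeMillis())  // event creation time in milliseconds
        .addMeta("tags", "green")                      // meta column: not tokenized
        .addNumber("num", 1234.0)                      // number column: indexed as a number
        .addText("text", "dashbase is cool");          // text column: tokenized

DashbaseEvent event = eventBuilder.build();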