akumuli/Akumuli

Sync Akumuli data to other data sources like Hadoop

qinhongsheng opened this issue · 2 comments

The time series data in Akumuli needs to be archived to secondary storage like Hadoop. How can I get the historical data by time range and write it to HDFS quickly?

Lazin commented

Hi,

There is more than one way to do this.

You can query Akumuli from the command line and generate a compressed CSV file like this:

curl -s --url localhost:8181/api/query -d '{ "join": ["cpu.user","cpu.sys","cpu.real","idle","mem.commit","mem.virt","iops","tcp.packets.in","tcp.packets.out","tcp.ret"], "range": { "from": "20190101T000000", "to": "20190201T000000" }, "output": { "format": "csv" }}' | gzip > 20190101-20190201.csv.gz

You can order by series or by timestamp. Also, you can have up to 32 columns in a join query. Unfortunately, you have to specify the metric names in the query; it's not yet possible to read all series at once.
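
For example, to get the rows ordered by timestamp instead of by series, an order-by field can be added to the query (a minimal sketch; the field name used here is an assumption, check it against the query docs for your version):

curl -s --url localhost:8181/api/query -d '{ "join": ["cpu.user","cpu.sys"], "range": { "from": "20190101T000000", "to": "20190201T000000" }, "order-by": "time", "output": { "format": "csv" }}' | gzip > cpu.20190101-20190201.csv.gz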

Another option is to read the data using a simple select query.

curl -s --url localhost:8181/api/query -d '{ "select": "cpu.user", "range": { "from": "20190101T000000", "to": "20190201T000000" }, "output": { "format": "csv" }}' | gzip > cpu.user.20190101-20190201.csv.gz

To get the list of series names, you can use this query:

curl -s --url localhost:8181/api/query -d '{ "select": "meta:names" }'

After that, you can run a select query for each one of them.
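
Put together, a small shell loop can export every metric one by one (a rough sketch, assuming the meta:names response contains one series name per line, possibly prefixed with '+', and that the first token of each name is the metric; adjust the parsing to the output you actually get):

# list all series names, keep only the metric part, and export each metric to a gzipped CSV
for metric in $(curl -s --url localhost:8181/api/query -d '{ "select": "meta:names" }' | sed 's/^+//' | awk '{print $1}' | sort -u); do
  curl -s --url localhost:8181/api/query -d "{ \"select\": \"$metric\", \"range\": { \"from\": \"20190101T000000\", \"to\": \"20190201T000000\" }, \"output\": { \"format\": \"csv\" }}" | gzip > "$metric.20190101-20190201.csv.gz"
done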

Depending on what you need, you can also query downsampled or filtered data.
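
For downsampling, a group-aggregate query can be used; the example below requests hourly mean values (a sketch, the exact field names for group-aggregate are assumed here and should be double-checked against the docs for your version):

curl -s --url localhost:8181/api/query -d '{ "group-aggregate": { "metric": "cpu.user", "step": "1h", "func": "mean" }, "range": { "from": "20190101T000000", "to": "20190201T000000" }, "output": { "format": "csv" }}' | gzip > cpu.user.hourly.20190101-20190201.csv.gz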

Thanks for your prompt response.