bingoohuang/blog

Playing with the TICK Stack

TICK Stack

The open source core, known as the TICK Stack, consists of four projects: Telegraf, InfluxDB, Chronograf, and Kapacitor.

References

  1. Intro to TICK Stack Components and InfluxEnterprise | Getting Started [2 of 7]

InfluxDB

docker pull influxdb
docker run -d  --name=influxdb -p 8086:8086 -v $PWD:/var/lib/influxdb influxdb

Run the influx client in this container:

docker exec -it influxdb influx

Getting started with InfluxDB

Some concepts

Data in InfluxDB is organized by “time series”, which contain a measured value, like “cpu_load” or “temperature”. Time series have zero to many points, one for each discrete sample of the metric. Points consist of time (a timestamp), a measurement (“cpu_load”, for example), at least one key-value field (the measured value itself, e.g. “value=0.64”, or “temperature=21.2”), and zero to many key-value tags containing any metadata about the value (e.g. “host=server01”, “region=EMEA”, “dc=Frankfurt”).

Conceptually you can think of a measurement as an SQL table, where the primary index is always time. Tags and fields are effectively columns in the table. Tags are indexed, and fields are not. The difference is that, with InfluxDB, you can have millions of measurements, you don’t have to define schemas up-front, and null values aren’t stored.

Line protocol

Points are written to InfluxDB using the Line Protocol, which has the following format:

<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]

The following lines are all examples of points that can be written to InfluxDB:

cpu,host=serverA,region=us_west value=0.64
payment,device=mobile,product=Notepad,method=credit billed=33,licenses=3i 1434067467100293230
stock,symbol=AAPL bid=127.46,ask=127.48
temperature,machine=unit42,type=assembly external=25,internal=37 1434067467000000000
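
As a rough illustration of the format above, here is a minimal Python sketch that assembles one line of Line Protocol from a measurement, tags, fields, and an optional nanosecond timestamp. The helper names (`to_line`, `format_field_value`) are made up for this example and are not part of any InfluxDB client. The escaping only covers commas, equals signs, and spaces, and tags are sorted by key, so their order can differ from the hand-written examples.

```python
def _escape(text, chars):
    """Backslash-escape each character in `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def format_field_value(value):
    """Render a Python value as a Line Protocol field value."""
    if isinstance(value, bool):      # bool must be checked before int
        return "true" if value else "false"
    if isinstance(value, int):       # integers carry a trailing "i"
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    # Anything else is written as a double-quoted string field.
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def to_line(measurement, fields, tags=None, timestamp_ns=None):
    """Build one line: measurement[,tags] fields [timestamp]."""
    if not fields:
        raise ValueError("a point needs at least one field")
    head = _escape(measurement, ", ")
    for key, value in sorted((tags or {}).items()):
        head += "," + _escape(key, ",= ") + "=" + _escape(str(value), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + format_field_value(value)
        for key, value in fields.items()
    )
    line = head + " " + body
    if timestamp_ns is not None:
        line += f" {int(timestamp_ns)}"
    return line


print(to_line("cpu", {"value": 0.64}, {"host": "serverA", "region": "us_west"}))
# cpu,host=serverA,region=us_west value=0.64
```

Leaving out `timestamp_ns` mirrors the examples without a timestamp, where the server assigns its own time on write.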

Some CLI operations

~/influxdb> docker exec -it influxdb influx
Connected to http://localhost:8086 version 1.7.4
InfluxDB shell version: 1.7.4
Enter an InfluxQL query
> CREATE DATABASE mydb
> show databases
name: databases
name
----
_internal
mydb
> use mydb
Using database mydb
> INSERT cpu,host=serverA,region=us_west value=0.64
> SELECT "host", "region", "value" FROM "cpu"
name: cpu
time                host    region  value
----                ----    ------  -----
1553505460601365400 serverA us_west 0.64
> INSERT temperature,machine=unit42,type=assembly external=25,internal=37
>
> SELECT * FROM "temperature"
name: temperature
time                external internal machine type
----                -------- -------- ------- ----
1553505708014898100 25       37       unit42  assembly
> SELECT * FROM /.*/ LIMIT 1
name: cpu
time                external host    internal machine region  type value
----                -------- ----    -------- ------- ------  ---- -----
1553505460601365400          serverA                  us_west      0.64

name: temperature
time                external host internal machine region type     value
----                -------- ---- -------- ------- ------ ----     -----
1553505708014898100 25            37       unit42         assembly

Some REST operations

~/influxdb> http -vf :8086/query "q=SHOW DATABASES"
POST /query HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 16
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Host: localhost:8086
User-Agent: HTTPie/1.0.2

q=SHOW+DATABASES

HTTP/1.1 200 OK
Content-Encoding: gzip
Content-Type: application/json
Date: Mon, 25 Mar 2019 09:54:13 GMT
Request-Id: f21706c8-4ee3-11e9-8014-0242ac110003
Transfer-Encoding: chunked
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.4
X-Request-Id: f21706c8-4ee3-11e9-8014-0242ac110003

{
    "results": [
        {
            "series": [
                {
                    "columns": [
                        "name"
                    ],
                    "name": "databases",
                    "values": [
                        [
                            "_internal"
                        ],
                        [
                            "mydb"
                        ]
                    ]
                }
            ],
            "statement_id": 0
        }
    ]
}

~/influxdb> echo cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000 | http -v ":8086/write?db=mydb"
POST /write?db=mydb HTTP/1.1
Accept: application/json, */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 75
Content-Type: application/json
Host: localhost:8086
User-Agent: HTTPie/1.0.2

cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000

HTTP/1.1 204 No Content
Content-Type: application/json
Date: Tue, 26 Mar 2019 01:35:16 GMT
Request-Id: 6843f186-4f67-11e9-8028-0242ac110003
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.4
X-Request-Id: 6843f186-4f67-11e9-8028-0242ac110003

2xx: If your write request received HTTP 204 No Content, it was a success!
Note: a 204 response from the write REST API carries no body, but it does mean the write succeeded.
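
For scripting the same write without HTTPie, a minimal Python sketch using only the standard library could look like the following. The function names are hypothetical, and the base URL and database are whatever you pass in. The request is only built here, not sent. The status descriptions follow the 2xx/4xx/5xx grouping in the InfluxDB 1.x write documentation and are deliberately coarse.

```python
import urllib.parse
import urllib.request


def build_write_request(base_url, db, lines, precision="ns"):
    """Build (but do not send) a POST to the InfluxDB 1.x /write endpoint."""
    query = urllib.parse.urlencode({"db": db, "precision": precision})
    body = "\n".join(lines).encode("utf-8")   # one point per line
    return urllib.request.Request(
        f"{base_url}/write?{query}", data=body, method="POST"
    )


def describe_write_status(status):
    """Map an HTTP status code from /write to a coarse outcome."""
    if status == 204:
        return "success"
    if 400 <= status < 500:
        return "request rejected"
    if 500 <= status < 600:
        return "server overloaded or impaired"
    return "unexpected"


req = build_write_request(
    "http://localhost:8086",
    "mydb",
    ["cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000"],
)
print(req.get_method(), req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen`. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, with the status available as `.code`.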

Writing multiple points:

~/influxdb> echo 'cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257' | http -v ":8086/write?db=mydb"
POST /write?db=mydb HTTP/1.1
Accept: application/json, */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 202
Content-Type: application/json
Host: localhost:8086
User-Agent: HTTPie/1.0.2

cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257

HTTP/1.1 204 No Content
Content-Type: application/json
Date: Tue, 26 Mar 2019 01:38:31 GMT
Request-Id: dca502cc-4f67-11e9-802d-0242ac110003
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.4
X-Request-Id: dca502cc-4f67-11e9-802d-0242ac110003

Querying data:

~/influxdb> http -vf :8086/query 'db=mydb' 'q=select "value" from cpu_load_short'
POST /query HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 48
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Host: localhost:8086
User-Agent: HTTPie/1.0.2

db=mydb&q=select+%22value%22+from+cpu_load_short

HTTP/1.1 200 OK
Content-Encoding: gzip
Content-Type: application/json
Date: Tue, 26 Mar 2019 01:55:44 GMT
Request-Id: 44966917-4f6a-11e9-8033-0242ac110003
Transfer-Encoding: chunked
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.4
X-Request-Id: 44966917-4f6a-11e9-8033-0242ac110003

{
    "results": [
        {
            "series": [
                {
                    "columns": [
                        "time",
                        "value"
                    ],
                    "name": "cpu_load_short",
                    "values": [
                        [
                            "2015-01-29T21:55:43.702900257Z",
                            2
                        ],
                        [
                            "2015-01-29T21:55:43.702900257Z",
                            0.55
                        ],
                        [
                            "2015-06-11T20:46:02Z",
                            0.64
                        ],
                        [
                            "2019-03-26T01:38:31.3539026Z",
                            0.67
                        ]
                    ]
                }
            ],
            "statement_id": 0
        }
    ]
}
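
The RFC3339 times in this response are the same instants as the nanosecond epoch values used in the writes. A small Python sketch of that conversion follows. `ns_to_rfc3339` is a hypothetical helper, and trimming trailing zeros from the fraction is done only to match how the timestamps appear in the output above.

```python
from datetime import datetime, timezone


def ns_to_rfc3339(ns):
    """Convert nanoseconds since the Unix epoch to an RFC3339 UTC string."""
    seconds, frac = divmod(ns, 1_000_000_000)
    # datetime only keeps microseconds, so the nanosecond part is formatted by hand.
    base = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    if frac:
        return f"{base}.{frac:09d}".rstrip("0") + "Z"
    return base + "Z"


print(ns_to_rfc3339(1434055562000000000))  # 2015-06-11T20:46:02Z
print(ns_to_rfc3339(1422568543702900257))  # 2015-01-29T21:55:43.702900257Z
```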

Multiple queries:

~/influxdb> http -vf :8086/query 'db=mydb' "q=select value from cpu_load_short;SELECT count(value) FROM cpu_load_short WHERE region='us-west'"
POST /query HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 117
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Host: localhost:8086
User-Agent: HTTPie/1.0.2

db=mydb&q=select+value+from+cpu_load_short%3BSELECT+count%28value%29+FROM+cpu_load_short+WHERE+region%3D%27us-west%27

HTTP/1.1 200 OK
Content-Encoding: gzip
Content-Type: application/json
Date: Tue, 26 Mar 2019 01:58:13 GMT
Request-Id: 9d199c91-4f6a-11e9-8034-0242ac110003
Transfer-Encoding: chunked
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.7.4
X-Request-Id: 9d199c91-4f6a-11e9-8034-0242ac110003

{
    "results": [
        {
            "series": [
                {
                    "columns": [
                        "time",
                        "value"
                    ],
                    "name": "cpu_load_short",
                    "values": [
                        [
                            "2015-01-29T21:55:43.702900257Z",
                            2
                        ],
                        [
                            "2015-01-29T21:55:43.702900257Z",
                            0.55
                        ],
                        [
                            "2015-06-11T20:46:02Z",
                            0.64
                        ],
                        [
                            "2019-03-26T01:38:31.3539026Z",
                            0.67
                        ]
                    ]
                }
            ],
            "statement_id": 0
        },
        {
            "series": [
                {
                    "columns": [
                        "time",
                        "count"
                    ],
                    "name": "cpu_load_short",
                    "values": [
                        [
                            "1970-01-01T00:00:00Z",
                            3
                        ]
                    ]
                }
            ],
            "statement_id": 1
        }
    ]
}
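
Each statement in the query gets its own entry under `results`, tagged with a `statement_id`, and every series pairs a `columns` list with rows of `values`. A minimal Python sketch for turning that shape into rows per statement is below. `rows_by_statement` is a hypothetical helper, the sample is a shortened copy of the response above, and extra keys that can appear in real responses (for example after GROUP BY) are ignored.

```python
import json


def rows_by_statement(response_text):
    """Return {statement_id: [(series_name, {column: value}), ...]}."""
    payload = json.loads(response_text)
    out = {}
    for result in payload.get("results", []):
        rows = []
        for series in result.get("series", []):
            for values in series.get("values", []):
                # Pair each value with its column name.
                rows.append((series["name"], dict(zip(series["columns"], values))))
        out[result["statement_id"]] = rows
    return out


sample = """
{"results": [
  {"series": [{"name": "cpu_load_short", "columns": ["time", "value"],
               "values": [["2015-06-11T20:46:02Z", 0.64]]}],
   "statement_id": 0},
  {"series": [{"name": "cpu_load_short", "columns": ["time", "count"],
               "values": [["1970-01-01T00:00:00Z", 3]]}],
   "statement_id": 1}
]}
"""

for statement_id, rows in rows_by_statement(sample).items():
    print(statement_id, rows)
```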

InfluxDB's data model: the tagset data model

With the InfluxDB tagset data model, each measurement has a timestamp, and an associated set of tags (tagset) and set of fields (fieldset). The fieldset represents the actual measurement reading values, while the tagset represents the metadata to describe the measurements. Field data types are limited to floats, ints, strings, and booleans, and cannot be changed without rewriting the data. Tagset values are indexed while fieldset values are not. Also, tagset values are always represented as strings, and cannot be updated.

The advantage of this approach is that if one’s data naturally fits the tagset model, then it is quite easy to get started, as one doesn’t have to worry about creating schemas or indexes. Conversely, the disadvantage of this model is that it is quite rigid and limited, with no ability to create additional indexes, indexes on continuous fields (e.g., numerics), update metadata after the fact, enforce data validation, etc. In particular, even though this model may feel “schemaless”, there is actually an underlying schema that is auto-created from the input data, which may differ from the desired schema.
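
To make the "auto-created schema" point concrete, here is a simplified Python sketch that infers tag keys and field types from incoming points and refuses a field whose type changes. It is a toy model under stated assumptions: `infer_schema` and `field_type` are hypothetical names, and the real server tracks field types internally in its own way rather than in one global dictionary.

```python
def field_type(value):
    """Classify a Python value into one of the four field types."""
    if isinstance(value, bool):   # bool must be checked before int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    return "string"


def infer_schema(points):
    """points: iterable of (measurement, tags_dict, fields_dict)."""
    schema = {}
    for measurement, tags, fields in points:
        entry = schema.setdefault(measurement, {"tags": set(), "fields": {}})
        entry["tags"].update(tags)            # tag keys accumulate freely
        for key, value in fields.items():
            kind = field_type(value)
            known = entry["fields"].setdefault(key, kind)
            if known != kind:                 # first write fixes the type
                raise TypeError(
                    f"field {key!r} on {measurement!r} is {known}, got {kind}"
                )
    return schema


schema = infer_schema([
    ("cpu", {"host": "serverA"}, {"value": 0.64}),
    ("cpu", {"host": "serverA", "region": "us_west"}, {"value": 0.65}),
])
print(schema)
```

The first write decides that `value` is a float; a later point carrying `value` as a string would raise here, which is the kind of surprise the quoted passage warns about.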

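Judging from this article comparing TimescaleDB and InfluxDB, InfluxDB drew on Gorilla, the time series database described in a Facebook paper.

Facebook published a paper in 2015 that describes Gorilla's technical details, titled "Gorilla: A Fast, Scalable, In-Memory Time Series Database"; the article referenced below is a detailed walkthrough of that paper. Some of Gorilla's design ideas are well worth borrowing and have already been applied in other time series databases.

References
Facebook's time series database technology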

Time Series Admin: administration and querying interface for InfluxDB databases

Since version 1.3, InfluxDB no longer ships the web admin UI. Fortunately, someone on GitHub has open-sourced an Electron-based application, timeseriesadmin.

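The confusing duration and shardGroupDuration

When InfluxDB creates a database, it automatically creates a default retention policy (RP) named autogen. The duration of autogen is 0s (infinite) and its shardGroupDuration is 168h0m0s (7 days).

That left me instantly confused: duration says data is kept forever, shardGroupDuration says 7 days, so what happens when the two are combined?

I then checked the documentation (quoted below). Its wording is easy to misread: an infinite duration does not mean data expires after 7 days. With the retention policy effectively disabled, nothing is dropped, and shardGroupDuration only sets the time window that each shard group covers.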

When you create a database in InfluxDB, you automatically create a default retention policy for that database called autogen. If you choose not to modify the default policy, the value is set to infinite. In this case, the shard group duration will default to 7 days. This means that your data will be stored in 1 week time windows. If your retention policy is on autogen (or infinite), the data is not actually stored infinitely—this just means the retention policy matches the shard group duration, so the retention policy is effectively disabled. On the other hand, the minimum time you can set your retention policy to is one hour.
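
The 7-day figure for autogen matches the default shard group durations listed in the InfluxDB 1.x documentation, which depend on the retention policy's duration. A small Python sketch of that table follows. `default_shard_group_duration` is a hypothetical name, and treating "6 months" as 180 days is an assumption made for this example.

```python
from datetime import timedelta

INFINITE = timedelta(0)  # a duration of 0s means "keep forever"


def default_shard_group_duration(rp_duration):
    """Pick the default shard group duration for a retention policy duration."""
    if rp_duration == INFINITE or rp_duration > timedelta(days=180):
        return timedelta(days=7)    # infinite or longer than ~6 months
    if rp_duration >= timedelta(days=2):
        return timedelta(days=1)    # 2 days up to ~6 months
    return timedelta(hours=1)       # shorter than 2 days


print(default_shard_group_duration(INFINITE))            # 7 days, 0:00:00
print(default_shard_group_duration(timedelta(days=30)))  # 1 day, 0:00:00
```

These are only defaults; a retention policy can be created or altered with an explicit shard group duration.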

Key InfluxDB concepts

database            field key      field set
field value         measurement    point
retention policy    series         tag key
tag set             tag value      timestamp

$ docker run -p 8086:8086 influxdb:1.7.11-alpine
$ docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED         STATUS         PORTS                                       NAMES
1fe87a8b8001   influxdb:1.7.11-alpine   "/entrypoint.sh infl…"   2 minutes ago   Up 2 minutes   0.0.0.0:8086->8086/tcp, :::8086->8086/tcp   quizzical_tesla
$ docker exec -it quizzical_tesla /bin/sh
/ # influx -precision rfc3339
Connected to http://localhost:8086 version 1.7.11
InfluxDB shell version: 1.7.11
> INSERT mymeas value=3 146593455900000000
ERR: {"error":"database is required"}

Note: error may be due to not setting a database or retention policy.
Please set a database with the command "use <database>" or
INSERT INTO <database>.<retention-policy> <point>
> show databases;
name: databases
name
----
_internal
> CREATE DATABASE mydb
> show databases;
name: databases
name
----
_internal
mydb
> use mydb
Using database mydb
> INSERT cpu,host=serverA,region=us_west value=0.64 1434067467000000000
> select * from cpu;
name: cpu
time                 host    region  value
----                 ----    ------  -----
2015-06-12T00:04:27Z serverA us_west 0.64
> INSERT cpu,host=serverA,region=us_west value=0.65 1434067467000000001
> select * from cpu;
name: cpu
time                           host    region  value
----                           ----    ------  -----
2015-06-12T00:04:27Z           serverA us_west 0.64
2015-06-12T00:04:27.000000001Z serverA us_west 0.65
> INSERT cpu,host=serverA,region=us_west value=0.65 1434067467000000002
> select * from cpu;
name: cpu
time                           host    region  value
----                           ----    ------  -----
2015-06-12T00:04:27Z           serverA us_west 0.64
2015-06-12T00:04:27.000000001Z serverA us_west 0.65
2015-06-12T00:04:27.000000002Z serverA us_west 0.65
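
The session ends with three rows because each INSERT used a different nanosecond timestamp. In InfluxDB a point is identified by its measurement, tag set, and timestamp, and a write that repeats all three merges its fields into the existing point instead of adding a row. The Python sketch below models that rule with a plain dictionary; `write_point` and `store` are hypothetical names for illustration, not InfluxDB APIs.

```python
def write_point(store, measurement, tags, fields, timestamp_ns):
    """Insert or merge a point keyed by (measurement, tag set, timestamp)."""
    key = (measurement, tuple(sorted(tags.items())), timestamp_ns)
    # Same key: the new fields overwrite or extend the stored field set.
    store.setdefault(key, {}).update(fields)
    return store


store = {}
tags = {"host": "serverA", "region": "us_west"}
write_point(store, "cpu", tags, {"value": 0.64}, 1434067467000000000)
write_point(store, "cpu", tags, {"value": 0.65}, 1434067467000000001)
write_point(store, "cpu", tags, {"value": 0.65}, 1434067467000000002)
print(len(store))  # 3

# Repeating an existing timestamp replaces the value instead of adding a row.
write_point(store, "cpu", tags, {"value": 0.70}, 1434067467000000002)
print(len(store))  # 3
```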