玩转一下TimescaleDB
Opened this issue · 0 comments
bingoohuang commented
玩转一下TimescaleDB
介绍
TimescaleDB是基于PostgreSQL数据库打造的一款时序数据库,插件化的形式,随着PostgreSQL的版本升级而升级,不会因为另立分支带来麻烦。
TimescaleDB实际上是PostgreSQL的一个插件,目前支持PostgreSQL 9和10两个版本。它把时间序列数据保存到“超级表”(TimescaleDB hypertables)中,超级表就像一张普通的数据库表一样,用户可以添加索引,执行连接,使用窗口函数等SQL功能。当然,TimescaleDB还额外提供了SQL的扩展功能,并且优化了数据库的查询计划,是之更加适用于时序数据。
超级表的底层实现,简单来说可以理解成按照时间和属性(time-space)把数据自动分区到不同的chunk上,这个过程有点像分表,但超级表的“分表”是按照事件和数据属性按需自动进行的,并且chunk大小也是自动调整的。
TimescaleDB架构:数据自动按时间和空间分片(chunk)。
安装
docker run -d --name timescaledb -p 5432:5432 -e POSTGRES_PASSWORD=password timescale/timescaledb:latest-pg9.6
# connect to PostgreSQL client
docker exec -it timescaledb psql -U postgres
-- Create the database, let's call it 'tutorial'
CREATE database tutorial;
-- Connect to the database
\c tutorial
-- Extend the database with TimescaleDB
CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;
-- Connecting to the new database is as simple as:
-- psql -U postgres -h localhost -d tutorial
-- We start by creating a regular SQL table
CREATE TABLE conditions (
time TIMESTAMPTZ NOT NULL,
location TEXT NOT NULL,
temperature DOUBLE PRECISION NULL,
humidity DOUBLE PRECISION NULL
);
tutorial=# \dt
List of relations
Schema | Name | Type | Owner
--------+------------+-------+----------
public | conditions | table | postgres
tutorial=# SELECT * FROM pg_catalog.pg_tables where tablename = 'conditions';
schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers | rowsecurity
------------+------------+------------+------------+------------+----------+-------------+-------------
public | conditions | postgres | | f | f | f | f
(1 row)
-- transform it into a hypertable with create_hypertable:
-- This creates a hypertable that is partitioned by time
-- using the values in the `time` column.
SELECT create_hypertable('conditions', 'time');
-- This creates a hypertable partitioned on both time and `location`. In this example,
-- the hypertable will partition `location` into 4 partitions.
SELECT create_hypertable('conditions', 'time', 'location', 4);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES (NOW(), 'office', 70.0, 50.0);
SELECT * FROM conditions ORDER BY time DESC LIMIT 100;
INSERT INTO conditions
VALUES
(NOW(), 'office', 70.0, 50.0),
(NOW(), 'basement', 66.5, 60.0),
(NOW(), 'garage', 77.0, 65.2);
-- specify that INSERT returns some or all of the inserted data via the RETURNING statement:
INSERT INTO conditions
VALUES (NOW(), 'office', 70.1, 50.1) RETURNING *;
UPDATE conditions SET temperature = 70.2, humidity = 50.0
WHERE time = '2017-07-28 11:42:42.846621+00' AND location = 'office';
UPDATE conditions SET temperature = temperature + 0.1
WHERE time >= '2017-07-28 11:40' AND time < '2017-07-28 11:50';