plutov/clickhouse-helm

[Support] Database does not exist

Closed this issue · 7 comments

This project is absolutely awesome and I was able to start working with ClickHouse on Kubernetes right away. My configuration is slightly different than the one you provide. In fact I do have:

  • 5 ClickHouse replicas
  • 3 Zookeeper replicas
  • Tabix + LB
  • ClickHouse LB

Everything works smoothly and I am able to connect to Tabix. The issue comes when I create a new database with CREATE DATABASE test; and then I click Reload Structure. Sometimes the database appears to the left among the other ones and sometimes it does not. The issue is when I try to create a new Table or I try to query. Very often it appears Code: 81, e.displayText() = DB::Exception: Database test doesn't exist while I created few moments before that.

To enforce my case, with the clickhouse-client happens as well. I was wondering if I am missing some configurations (something related to loadbalancing?) or what i am doing wrong.

Thank you in advance.

Hi @spaghettifunk
What type of the table are you using? If it's a replicated table, then CREATE query is executed on a single server and is not replicated - https://clickhouse.yandex/docs/en/operations/table_engines/replication/
Try to create table on all nodes.

Thanks for the fast reply. I do try to create the Table in replicated manner. The problem is that the query doesn't go through because it returns the exception I put above.

Interesting, I haven't tried creating a new DB in the cluster, let me try it and get back to you in few days.

Great! I am willing to help you a lot on this because I want to make it production ready but I need to solve these little issues I'm having. Also, It would be cool to make it as an official Chart and submit a PR to the official repository :)

Hi @spaghettifunk,
I tested on a clean cluster :) here is my log:

In ClickHouse CREATE statements are not replicated, so you have to create databases, tables manually on all nodes.

Create database on ALL nodes:

clickhouse-0.clickhouse.default.svc.cluster.local :) CREATE DATABASE test;

CREATE DATABASE test

Ok.

0 rows in set. Elapsed: 0.199 sec. 

clickhouse-0.clickhouse.default.svc.cluster.local :) show databases;

SHOW DATABASES

┌─name────┐
│ default │
│ system  │
│ test    │
└─────────┘

3 rows in set. Elapsed: 0.068 sec. 

clickhouse-1.clickhouse.default.svc.cluster.local :) show databases;

SHOW DATABASES

┌─name────┐
│ default │
│ system  │
└─────────┘

2 rows in set. Elapsed: 0.086 sec.

clickhouse-1.clickhouse.default.svc.cluster.local :) CREATE DATABASE test

CREATE DATABASE test

Ok.

0 rows in set. Elapsed: 0.054 sec. 

clickhouse-1.clickhouse.default.svc.cluster.local :) show databases;

SHOW DATABASES

┌─name────┐
│ default │
│ system  │
│ test    │
└─────────┘

3 rows in set. Elapsed: 0.043 sec.

Run these queries on ALL nodes:

USE test;

CREATE TABLE IF NOT EXISTS clicks_sharded (
    date Date DEFAULT toDate(request_time),
    request_time DateTime DEFAULT now(),
    id String,
    advertiser String,
    network String,
    publisher String,
    offer String,
    campaign String,
    sub_id String,
    click_ip String,
    referer String,
    user_agent String,
    device_id String,
    country String,
    city String,
    browser_language String,
    campaign_type String,
    revenue String,
    payout String
) ENGINE=ReplicatedMergeTree('/clickhouse/tables/{shard}/test/clicks_sharded', '{replica}', date, (date, id), 8192);

CREATE TABLE IF NOT EXISTS clicks AS clicks_sharded
ENGINE = Distributed( clicks_cluster, test, clicks_sharded , rand() );
  1. Check inserts

a. Insert on node 0.

USE test;

INSERT INTO clicks (id) VALUES ('1');

INSERT INTO clicks (id) VALUES

Ok.

1 rows in set. Elapsed: 0.091 sec.

b. Select from another node (1).

USE test;

clickhouse-1.clickhouse.default.svc.cluster.local :) SELECT id FROM clicks;

SELECT id
FROM clicks 

┌─id─┐
│ 1  │
└────┘

1 rows in set. Elapsed: 0.073 sec.

Hi @plutov! thanks for the effort! I think I am still a noob on ClickHouse I thought the problem was with the deployment while in reality was with the way I create stuff. Yesterday evening I realised that I could use the ON CLUSTER clause to dispatch the creation of the database and tables on all nodes as long as I use the ReplicatedMergeTree Engine.

Regardless of this issue, what do you think if I make a propose for an official Helm Chart based on your work?

I was thinking about it previously, and also want to make it as official chart. One problem is that I am using custom Dockerfile, since I do a custom run.sh there. It will be nice if official chart works with official ClickHouse image only.