/clickhouse-jdbc

JDBC driver for ClickHouse

Primary LanguageJavaApache License 2.0Apache-2.0

ClickHouse Java Client & JDBC Driver

GitHub release (latest SemVer including pre-releases) Build Status(https://github.com/ClickHouse/clickhouse-jdbc/workflows/Build/badge.svg) Coverage

Java client and JDBC driver for ClickHouse. Java client is async, lightweight, and low-overhead library for ClickHouse; while JDBC driver is built on top of the Java client with more dependencies and extensions for JDBC-compliance.

Java 8 or higher is required in order to use Java client(clickhouse-client) and/or JDBC driver(clickhouse-jdbc). In addition, starting from 0.3.2, JDBC driver only works with ClickHouse 20.7 or above, so please consider to either downgrade the driver to 0.3.1-patch or upgrade server to one of active releases.


IMPORTANT

Maven groupId ru.yandex.clickhouse and legacy JDBC driver ru.yandex.clickhouse.ClickHouseDriver are deprecated.

Please use new groupId com.clickhouse and driver com.clickhouse.jdbc.ClickHouseDriver instead. It's highly recommended to upgrade to 0.3.2+ and start to integrate the new JDBC driver for improved performance and stability.

image Note: in general, the new driver(v0.3.2) is a few times faster with less memory usage. More information can be found at here.


Features

Category Feature Supported Remark
Protocol HTTP recommended, defaults to java.net.HttpURLConnection and can be changed to java.net.http.HttpClient(faster but less stable)
gRPC still experimental, works with 22.3+, known to has issue when using LZ4 compression
TCP/Native will be available in 0.3.3
Compatibility Server < 20.7 use 0.3.1-patch(or 0.2.6 if you're stuck with JDK 7)
Server >= 20.7 use 0.3.2 or above. All active releases are supported.
Data Type AggregatedFunction limited to groupBitmap
Array(*)
Bool
Date*
DateTime*
Decimal* SET output_format_decimal_trailing_zeros=1 in 21.9+ for consistency
Enum* can be treated as both string and integer
Geo Types Point, Ring, Polygon, and MultiPolygon
Int*, UInt* UInt64 is mapped to long
IPv*
Map(*)
Nested(*)
Object('JSON')
SimpleAggregateFunction
*String
Tuple(*)
UUID
Format RowBinary RowBinaryWithNamesAndTypes for query and RowBinary for insertion
TabSeparated Does not support as many data types as RowBinary

Configuration

  • Client option, server setting, and default value

    You can pass any client option(common, http and grpc) to ClickHouseRequest.option() and server setting to ClickHouseRequest.set() before execution, for instance:

    ClickHouseRequest<?> request = client.connect(myServer);
    request
        .query("select 1")
        // short version of option(ClickHouseClientOption.FORMAT, ClickHouseFormat.RowBinaryWithNamesAndTypes)
        .format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
        .option(ClickHouseClientOption.SOCKET_TIMEOUT, 30000 * 2)
        .set("max_rows_to_read", 100)
        .set("read_overflow_mode", "throw")
        .execute()
        .whenComplete((response, throwable) -> {
            if (throwable != null) {
                log.error("Unexpected error", throwable);
            } else {
                try {
                    for (ClickHouseRecord rec : response.records()) {
                        // ...
                    }
                } finally {
                    response.close();
                }
            }
        });

    Default value can be either configured via system property or environment variable.

  • JDBC configuration

    Driver Class: com.clickhouse.jdbc.ClickHouseDriver

    Note: ru.yandex.clickhouse.ClickHouseDriver and everything under ru.yandex.clickhouse will be removed starting from 0.4.0.

    URL Syntax: jdbc:<prefix>[:<protocol>]://<host>:[<port>][/<database>[?param1=value1&param2=value2]], for examples:

    • jdbc:ch://localhost is same as jdbc:clickhouse:http://localhost:8123
    • jdbc:ch:grpc://localhost is same as jdbc:clickhouse:grpc://localhost:9100
    • jdbc:ch://localhost/test?socket_timeout=120000

    Connection Properties:

    Property Default Description
    continueBatchOnError false Whether to continue batch processing when error occurred
    custom_http_headers comma separated custom http headers, for example: User-Agent=client1,X-Gateway-Id=123
    custom_http_params comma separated custom http query parameters, for example: extremes=0,max_result_rows=100
    jdbcCompliance true Whether to support standard synchronous UPDATE/DELETE and fake transaction
    typeMappings Customize mapping between ClickHouse data type and Java class, which will affect result of both getColumnType() and getObject(Class<?>). For example: UInt128=java.lang.String,UInt256=java.lang.String
    wrapperObject false Whether getObject() should return java.sql.Array / java.sql.Struct for Array / Tuple.

    Note: please refer to JDBC specific configuration and client options(common, http and grpc) for more.

Examples

Java Client

<dependency>
    <groupId>com.clickhouse</groupId>
    <!-- or clickhouse-grpc-client if you prefer gRPC -->
    <artifactId>clickhouse-http-client</artifactId>
    <version>0.3.2-patch9</version>
</dependency>
// only HTTP and gRPC are supported at this point
ClickHouseProtocol preferredProtocol = ClickHouseProtocol.HTTP;
// you'll have to parse response manually if use different format
ClickHouseFormat preferredFormat = ClickHouseFormat.RowBinaryWithNamesAndTypes;

// connect to localhost, use default port of the preferred protocol
ClickHouseNode server = ClickHouseNode.builder().port(preferredProtocol).build();

try (ClickHouseClient client = ClickHouseClient.newInstance(preferredProtocol);
    ClickHouseResponse response = client.connect(server)
        .format(preferredFormat)
        .query("select * from numbers(:limit)")
        .params(1000).executeAndWait()) {
    // or resp.stream() if you prefer stream API
    for (ClickHouseRecord r : response.records()) {
        int num = r.getValue(0).asInteger();
        String str = r.getValue(0).asString();
    }

    ClickHouseResponseSummary summary = response.getSummary();
    long totalRows = summary.getTotalRowsToRead();
}

JDBC Driver

<dependency>
    <!-- will stop using ru.yandex.clickhouse starting from 0.4.0  -->
    <groupId>com.clickhouse</groupId>
    <artifactId>clickhouse-jdbc</artifactId>
    <version>0.3.2-patch9</version>
    <!-- below is only needed when all you want is a shaded jar -->
    <classifier>http</classifier>
    <exclusions>
        <exclusion>
            <groupId>*</groupId>
            <artifactId>*</artifactId>
        </exclusion>
    </exclusions>
</dependency>
String url = "jdbc:ch://localhost/test";
Properties properties = new Properties();
// optionally set connection properties
properties.setProperty("client_name", "Agent #1");
...

ClickHouseDataSource dataSource = new ClickHouseDataSource(url, properties);
try (Connection conn = dataSource.getConnection();
    Statement stmt = conn.createStatement();
    ResultSet rs = stmt.executeQuery("select * from mytable")) {
    ...
}

Build with Maven

Use mvn clean verify to compile, test and generate shaded packages if you're using JDK 8. To create a multi-release jar file(see JEP-238), please use JDK 11 or above and follow instructions below:

  • make sure you have ~/.m2/toolchains.xml, for example:

    <?xml version="1.0" encoding="UTF8"?>
    <toolchains>
        <toolchain>
            <type>jdk</type>
            <provides>
                <version>11</version>
            </provides>
            <configuration>
                <jdkHome>${{ env.JDK11_HOME }}</jdkHome>
            </configuration>
        </toolchain>
    </toolchains>
  • run mvn -Drelease clean install to build and install the artificat to local repository

    Note: if you need to build modules separately, please start with clickhouse-client, followed by clickhouse-http-client and clickhouse-grpc-client, and then clickhouse-jdbc and clickhouse-benchmark.

Benchmark

To benchmark JDBC drivers:

cd clickhouse-benchmark
mvn -Drelease clean package
# single thread mode
java -DdbHost=localhost -jar target/benchmarks.jar -t 1 \
    -p client=clickhouse-jdbc -p connection=reuse \
    -p statement=prepared Query.selectInt8

It's time consuming to run all benchmarks against all drivers using different parameters for comparison. If you just need some numbers to understand performance, please refer to table below and some more details like CPU and memory usage mentioned at here(still have plenty of room to improve according to ranking at here).

Testing

By default, docker container will be created automatically during integration test. You can pass system property like -DclickhouseVersion=21.8 to specify version of ClickHouse.

In the case you prefer to test against an existing server, please follow instructions below:

  • make sure the server can be accessed using default account(user default and no password), which has both DDL and DML privileges
  • add below two configuration files to the existing server and expose all ports for external access
    • ports.xml - enable all ports
    • and users.xml - accounts used for integration test Note: you may need to change root element from clickhouse to yandex when testing old version of ClickHouse.
  • put test.properties under either ~/.clickhouse or src/test/resources of your project, with content like below:
    clickhouseServer=x.x.x.x
    # below properties are only useful for test containers
    #clickhouseVersion=latest
    #clickhouseTimezone=UTC
    #clickhouseImage=clickhouse/clickhouse-server
    #additionalPackages=