vahid-sohrabloo/chconn

Support of nested arrays?

bgranvea opened this issue · 3 comments

I'm interested in your driver to improve insert performance (I'm currently using clickhouse-go), but I have some columns of type Array(Nullable(UInt32)) and Array(Array(Nullable(UInt32))).

Are these types supported?

Thanks.

Yes, any level of nesting is supported. For your example:

package main

import (
	"context"
	"os"

	"github.com/vahid-sohrabloo/chconn/chpool"
	"github.com/vahid-sohrabloo/chconn/column"
)

func main() {
	connString := os.Getenv("DATABASE_URI")

	conn, err := chpool.Connect(context.Background(), connString)
	if err != nil {
		panic(err)
	}
	_, err = conn.Exec(context.Background(), `DROP TABLE IF EXISTS test_nested_array`)
	if err != nil {
		panic(err)
	}

	_, err = conn.Exec(context.Background(), `CREATE TABLE test_nested_array (
				col1 Array(Nullable(UInt32)),
				col2  Array(Array(Nullable(UInt32)))
			) Engine=Memory`)
	if err != nil {
		panic(err)
	}
	// Pass true as the first argument to make the column nullable.
	col1 := column.NewUint32(true)
	// To write an array we need a second column that takes col1 as its child.
	col1Array := column.NewArray(col1)

	// Now we can append data to this column.
	ourData := []uint32{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	// First append the array length to col1Array, then append the elements to col1.
	col1Array.AppendLen(len(ourData))
	for _, v := range ourData {
		col1.Append(v)
		// This example assumes that no element is null.
		col1.AppendIsNil(false)
	}

	// A nested array needs one more column per nesting level.
	// Pass true as the first argument to make the column nullable.
	col2 := column.NewUint32(true)
	// The inner array column takes col2 as its child.
	col2Array := column.NewArray(col2)
	// The outer array column takes col2Array as its child.
	col2NestedArray := column.NewArray(col2Array)

	// Now we can append data to this column.
	ourNestedData := [][]uint32{
		{1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
		{11, 12, 13, 14, 15, 16, 17, 18, 19, 20},
	}
	col2NestedArray.AppendLen(len(ourNestedData))
	for _, d := range ourNestedData {
		col2Array.AppendLen(len(d))
		for _, v := range d {
			col2.Append(v)
			// This example assumes that no element is null.
			col2.AppendIsNil(false)
		}
	}

	err = conn.Insert(context.Background(), `INSERT INTO test_nested_array (col1, col2) VALUES`, col1Array, col2NestedArray)
	if err != nil {
		panic(err)
	}
}
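
To make the append order in the example easier to see, here is a small standalone sketch that needs neither chconn nor a server. It splits a nested slice into the three pieces the example appends separately: the outer array length, each inner array length, and the flat values. `flattenNested` is a hypothetical helper written for illustration only; it is not part of chconn and does not claim to show how the driver works internally.

```go
package main

import "fmt"

// flattenNested is a hypothetical helper (not part of chconn). It returns the
// pieces that the example appends one by one:
//   - outerLen:  what goes to col2NestedArray.AppendLen
//   - innerLens: what goes to col2Array.AppendLen, once per inner array
//   - values:    what goes to col2.Append, once per element
func flattenNested(rows [][]uint32) (outerLen int, innerLens []int, values []uint32) {
	outerLen = len(rows)
	for _, inner := range rows {
		innerLens = append(innerLens, len(inner))
		values = append(values, inner...)
	}
	return outerLen, innerLens, values
}

func main() {
	outerLen, innerLens, values := flattenNested([][]uint32{
		{1, 2, 3},
		{}, // an empty inner array still contributes a length of 0
		{4, 5},
	})
	fmt.Println(outerLen, innerLens, values) // prints: 3 [3 0 2] [1 2 3 4 5]
}
```

Each extra nesting level adds one more list of lengths, which matches the one extra array column per level in the example.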

For nullable data, there are other ways to insert nulls, such as using AppendP for pointer data.
Feel free to ask any other questions or report any problems.
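
As a rough illustration of the pointer idea, the standalone sketch below turns a slice of pointers into a value list plus a null flag list, which is the same pairing as the Append / AppendIsNil calls in the example. `splitNullable` is a hypothetical helper, not a chconn function, and the zero placeholder for null slots is an assumption made for this sketch.

```go
package main

import "fmt"

// splitNullable is a hypothetical helper (not part of chconn). For every
// element it produces one value and one null flag, mirroring the
// Append / AppendIsNil pair used in the example.
func splitNullable(data []*uint32) (values []uint32, isNil []bool) {
	for _, p := range data {
		if p == nil {
			values = append(values, 0) // assumed placeholder for a null slot
			isNil = append(isNil, true)
			continue
		}
		values = append(values, *p)
		isNil = append(isNil, false)
	}
	return values, isNil
}

func main() {
	a, b := uint32(7), uint32(9)
	values, isNil := splitNullable([]*uint32{&a, nil, &b})
	fmt.Println(values, isNil) // prints: [7 0 9] [false true false]
}
```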

Please note that in ClickHouse it is better to insert in batches. You can append any number of rows with the Append functions and then write them with a single Insert call.
For better performance, it is also better to reuse the column objects (you have to call the Reset() function after each insert).
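
The batch-and-reuse pattern can be sketched without a server as follows. Everything here is a stand-in: `uint32Buffer` plays the role of a column object, the `flush` callback plays the role of the Insert call, and `insertInBatches` is a hypothetical helper. None of these names come from chconn; only the shape of the loop (append many rows, write once, reset, repeat) is taken from the advice above.

```go
package main

import "fmt"

// uint32Buffer is a hypothetical stand-in for a column object.
type uint32Buffer struct{ values []uint32 }

func (b *uint32Buffer) Append(v uint32) { b.values = append(b.values, v) }
func (b *uint32Buffer) Len() int        { return len(b.values) }

// Reset empties the buffer but keeps its capacity, so the next batch
// does not need to allocate again.
func (b *uint32Buffer) Reset() { b.values = b.values[:0] }

// insertInBatches appends rows to buf, calls flush once per full batch and
// once more for any remainder, and returns the number of flush calls.
func insertInBatches(rows []uint32, batchSize int, buf *uint32Buffer, flush func([]uint32) error) (int, error) {
	flushes := 0
	for _, v := range rows {
		buf.Append(v)
		if buf.Len() == batchSize {
			if err := flush(buf.values); err != nil {
				return flushes, err
			}
			flushes++
			buf.Reset() // reuse the same buffer for the next batch
		}
	}
	if buf.Len() > 0 {
		if err := flush(buf.values); err != nil {
			return flushes, err
		}
		flushes++
		buf.Reset()
	}
	return flushes, nil
}

func main() {
	buf := &uint32Buffer{}
	n, err := insertInBatches(make([]uint32, 25), 10, buf, func(batch []uint32) error {
		fmt.Println("writing", len(batch), "rows") // stand-in for one Insert call
		return nil
	})
	fmt.Println(n, err)
}
```

With 25 rows and a batch size of 10, the callback runs three times (10, 10, and 5 rows) instead of 25 times.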

Thanks for the detailed response! I've done some basic tests, and now I'm trying to integrate this into our code to see if performance is better.