IMSMWU/RClickhouse

dbWriteTable error when there are spaces (or other special characters) in column names

Closed this issue ยท 6 comments

Hi

I cannot write tables with spaces (or any other special character) in column names.
The R code below shows an example.

library(RClickhouse)
library(DBI)

con <- dbConnect(clickhouse(), port=9000)

test1 <- data.frame(
   "Col1"=c("b","b"),
   "Col2_no_space"=1:2,
   stringsAsFactors=FALSE,
   check.names=FALSE
)
dbWriteTable(con, "test1", test1)
dbListTables(con)
#> [1] "test1"

test2 <- data.frame(
   "Col1"=c("b","b"),
   "Col2 with space"=1:2,
   stringsAsFactors=FALSE,
   check.names=FALSE
)
dbWriteTable(con, "test2", test2)
#> Error in select(conn@ptr, statement): DB::Exception: Syntax error: failed at position 46: space Int32) ENGINE=TinyLog. Expected one of: CODEC, ALIAS, TTL, ClosingRoundBracket, Comma, DEFAULT, MATERIALIZED, COMMENT, token
dbListTables(con)
#> [1] "test1"


sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] DBI_1.1.0         RClickhouse_0.5.2
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.3       knitr_1.28       magrittr_1.5     tidyselect_1.0.0
#>  [5] bit_1.1-15.2     R6_2.4.1         rlang_0.4.5      stringr_1.4.0   
#>  [9] highr_0.8        dplyr_0.8.5      tools_3.6.3      xfun_0.12       
#> [13] dbplyr_1.4.2     htmltools_0.4.0  yaml_2.2.1       bit64_0.9-7     
#> [17] digest_0.6.25    assertthat_0.2.1 tibble_2.1.3     crayon_1.3.4    
#> [21] purrr_0.3.3      glue_1.3.1       evaluate_0.14    rmarkdown_2.1   
#> [25] stringi_1.4.6    compiler_3.6.3   pillar_1.4.3     pkgconfig_2.0.3

And here is information about ClickHouse version:

docker run -d \
	--name ch_test \
	--ulimit nofile=262144:262144 \
	--publish=8123:8123 --publish=9000:9000 \
	--restart=always \
	yandex/clickhouse-server:20.1.6.3

Do you have an idea how I could solve this issue?

Thank you for your help and for this very useful package.

Best regards

Patrice

@patzaw thanks for reporting this issue. we'll look into it and update you as soon as possible. If we do not find a workaround we'll close it in the next release.

Hi

Thanks for your answer.
I have a workaround with version 0.4.0 of RClickhouse that could maybe help you debbugging this issue.

The following workaround works with version 0.4.0...

packageDescription("RClickhouse")$Version
#> [1] "0.4.0"
library(DBI)
library(RClickhouse)

con <- dbConnect(clickhouse(), port=9000)
dbListTables(con)
#> character(0)

test <- data.frame(
   "Col1"=c("b","b"),
   "Col2 with space"=1:2,
   stringsAsFactors=FALSE,
   check.names=FALSE
)

dbSendQuery(con, "CREATE TABLE Test (`Col1` String, `Col2 with space` Float64) ENGINE TinyLog")
#> <ClickhouseResult>
#>   SQL  CREATE TABLE Test (`Col1` String, `Col2 with space` Float64) ENGINE TinyLog
#>   ROWS Fetched: 0 [complete]
#>        Changed: 0
dbListTables(con)
#> [1] "Test"
dbWriteTable(con, "Test", test, append=TRUE)
dbReadTable(con, "Test")
#>   Col1 Col2 with space
#> 1    b               1
#> 2    b               2

... but not with version 0.5.2

packageDescription("RClickhouse")$Version
#> [1] "0.5.2"
library(DBI)
library(RClickhouse)

con <- dbConnect(clickhouse(), port=9000)
dbListTables(con)
#> character(0)

test <- data.frame(
   "Col1"=c("b","b"),
   "Col2 with space"=1:2,
   stringsAsFactors=FALSE,
   check.names=FALSE
)

dbSendQuery(con, "CREATE TABLE Test (`Col1` String, `Col2 with space` Float64) ENGINE TinyLog")
#> <ClickhouseResult>
#>   SQL  CREATE TABLE Test (`Col1` String, `Col2 with space` Float64) ENGINE TinyLog
#>   ROWS Fetched: 0 [complete]
#>        Changed: 0
dbListTables(con)
#> [1] "Test"
dbWriteTable(con, "Test", test, append=TRUE)
#> Error in insert(conn@ptr, qname, value): DB::Exception: Syntax error: failed at position 32: with space ) VALUES. Expected one of: ClosingRoundBracket, Comma, Dot, token
dbReadTable(con, "Test")
#> [1] Col1            Col2 with space
#> <0 rows> (or 0-length row.names)

The rest of sessionInfo is the same than the one in my first post.

I hope it helps

Thanks again

Patrice

@patzaw Thank you very much for the follow-up and your suggestion. I will look into it.

Looks like it is a clickhouse-cpp problem: ClickHouse/clickhouse-cpp#30

#66 fixes #58

Creating columns that have spaces in their identifier is now possible. No quoting is required when using one of the built-in methods such as dbWriteTable. Nevertheless, RCH also accepts the same quotes for identifiers as Clickhouse does, namely backticks or double-quotes.

It's working! Many thanks @tridelt !