Write a tbl_dbi as a table in bigquery
joscani opened this issue · 8 comments
Hi. I new using bigrquery . I'm trying to write my tabla3
( results of join of two tables) in a new table in bigquery.
I'm using src_bigquery to use dplyr syntax.
tabla3
Job complete
Billed: 0 B
Downloading first chunk of data.
First chunk includes all requested rows.
# Source: SQL [?? x 41]
# Database: BigQueryConnection
customerID gender_x SeniorCitizen_x Partner_x Dependents_x tenure_x PhoneService_x MultipleLines_x InternetService_x OnlineSecurity_x
<chr> <chr> <int> <lgl> <lgl> <int> <lgl> <chr> <chr> <chr>
1 9732-OUYRN Female 0 TRUE FALSE 49 TRUE No No No internet ser…
2 0661-KQHNK Female 0 TRUE TRUE 6 TRUE No No No internet ser…
3 4709-LKHYG Female 0 TRUE TRUE 29 TRUE No No No internet ser…
4 9824-QCJPK Male 0 TRUE FALSE 36 TRUE No No No internet ser…
5 4716-MRVEN Female 0 FALSE FALSE 29 TRUE No No No internet ser…
6 8084-OIVBS Female 0 FALSE FALSE 11 TRUE No No No internet ser…
7 4323-ELYYB Male 0 TRUE TRUE 13 TRUE No No No internet ser…
8 0967-BMLBD Female 0 TRUE TRUE 11 TRUE No No No internet ser…
9 1269-FOYWN Male 0 TRUE TRUE 44 TRUE No No No internet ser…
10 3556-BVQGL Female 0 TRUE FALSE 20 TRUE No No No internet ser…
# ℹ more rows
# ℹ 31 more variables: OnlineBackup_x <chr>, DeviceProtection_x <chr>, TechSupport_x <chr>, StreamingTV_x <chr>,
# StreamingMovies_x <chr>, Contract_x <chr>, PaperlessBilling_x <lgl>, PaymentMethod_x <chr>, MonthlyCharges_x <dbl>,
# TotalCharges_x <chr>, Churn_x <lgl>, gender_y <chr>, SeniorCitizen_y <int>, Partner_y <lgl>, Dependents_y <lgl>, tenure_y <int>,
# PhoneService_y <lgl>, MultipleLines_y <chr>, InternetService_y <chr>, OnlineSecurity_y <chr>, OnlineBackup_y <chr>,
# DeviceProtection_y <chr>, TechSupport_y <chr>, StreamingTV_y <chr>, StreamingMovies_y <chr>, Contract_y <chr>,
# PaperlessBilling_y <lgl>, PaymentMethod_y <chr>, MonthlyCharges_y <dbl>, TotalCharges_y <chr>, Churn_y <lgl>
# ℹ Use `print(n = ...)` to see more rows
Tabla3 has class
class(tabla3)
[1] "tbl_BigQueryConnection" "tbl_dbi" "tbl_sql" "tbl_lazy" "tbl"
How do I convert tabla3 to a bq_table and later save in a new bigquery table ???
Thanks
Maybe sometthing like that.
write_table_bq <- function(table_to_write, project,dataset, table_new){
query <- stringr::str_glue("CREATE TABLE `{project}.{dataset}.{table_new}` AS {dbplyr::sql_render(table_to_write) %>% as.character()}")
bigrquery::bq_project_query(project, query)
}
You should be able to use compute(tbl, "name", temporary = FALSE)
.
Thanks @hadley by answer me and for your bigrquery package. bigrquery will help me in my job the next months.
compute(tbl, "name", temporary = FALSE)
not work , (using name like myproject.dataset.nametable, neither with dataset.nametable or only nametable.
For write tables I use the function in my previous coment. write_table_bq
And this morning I'm working in a simple way to write table in temporary dataset and then bring the refereence again to use in bigrquery and use dplyr syntax.
What do you think about this idea?
get_bq_table <- function(table, project,dataset){
query <- stringr::str_glue("{dbplyr::sql_render(table) %>% as.character()}")
tb_bq <- bigrquery::bq_project_query(project, query)
}
get_lazy_from_bq <- function(table, con){
meta <- bigrquery::bq_table_meta(table)
tb_name <- paste0(meta$tableReference$datasetId, ".", meta$tableReference$tableId)
con %>% tbl(I(tb_name))
}
bq_compute <- function(table) {
atrib_con <- attributes(table$src$con)
project <- atrib_con$project
dataset <- atrib_con$dataset
con <- table$src
temp_table <- get_bq_table(table, project, dataset)
ref_table <- get_lazy_from_bq(temp_table, con)
}
And I can do things like that
tabla4 <- tabla1 %>%
inner_join(tabla2, by = "customerID") %>%
bq_compute()
If compute()
doesn't work, it should be made to work, because that's the official dbplyr syntax.
Uhm, I'm just update my packages, and compute works perfectly. Thanks @hadley . I close the issue with this comment
The problem was if I have a connection without dataset .
sc <- src_bigquery("my_project", dataset = NULL, max_pages = 10)
tabla1 <- sc %>% tbl(I("churn.tabla1"))
tabla2 <- sc %>% tbl(I("churn.tabla2"))
# This works
tabla3 <- tabla1 %>%
inner_join(tabla2, by = "customerID") %>%
compute()
# This not works.
tabla3 %>%
compute("churn.tabla3.", temporary = FALSE)
But if I use src_bigquery with dataset, all works fine
sc <- src_bigquery("my_project", dataset = "churn", max_pages = 10)
Shouldn't that be tabla3 %>% compute(I("churn.tabla3"), temporary = FALSE)
? If that doesn't work, I'd say it's a bug.
Ups.. You're right. The I
is neccesary
tabla3 %>% compute(I("churn.test"), temporary = FALSE)
load_tabla3 <- sc %>% tbl(I("churn.test"))
load_tabla3 %>% head(6)
# Source: SQL [6 x 41]
# Database: BigQueryConnection
customerID gender_x SeniorCitizen_x Partner_x Dependents_x tenure_x
<chr> <chr> <int> <lgl> <lgl> <int>
1 8883-GRDWQ Male 1 FALSE FALSE 20
2 1088-CNNKB Male 0 TRUE FALSE 63
3 2282-YGNOR Female 0 FALSE FALSE 29
4 2361-FJWNO Male 0 FALSE FALSE 40
5 3605-JISKB Male 1 TRUE FALSE 55
6 4003-OCTMP Female 0 TRUE FALSE 31
Thanks a lot. In next months I'll use this package in my job..