r-dbi/bigrquery

Write a tbl_dbi as a table in bigquery

joscani opened this issue · 8 comments

Hi. I new using bigrquery . I'm trying to write my tabla3 ( results of join of two tables) in a new table in bigquery.
I'm using src_bigquery to use dplyr syntax.

 tabla3
Job complete
Billed: 0 B
Downloading first chunk of data.
First chunk includes all requested rows.
# Source:   SQL [?? x 41]
# Database: BigQueryConnection
   customerID gender_x SeniorCitizen_x Partner_x Dependents_x tenure_x PhoneService_x MultipleLines_x InternetService_x OnlineSecurity_x
   <chr>      <chr>              <int> <lgl>     <lgl>           <int> <lgl>          <chr>           <chr>             <chr>           
 1 9732-OUYRN Female                 0 TRUE      FALSE              49 TRUE           No              No                No internet ser2 0661-KQHNK Female                 0 TRUE      TRUE                6 TRUE           No              No                No internet ser3 4709-LKHYG Female                 0 TRUE      TRUE               29 TRUE           No              No                No internet ser4 9824-QCJPK Male                   0 TRUE      FALSE              36 TRUE           No              No                No internet ser5 4716-MRVEN Female                 0 FALSE     FALSE              29 TRUE           No              No                No internet ser6 8084-OIVBS Female                 0 FALSE     FALSE              11 TRUE           No              No                No internet ser7 4323-ELYYB Male                   0 TRUE      TRUE               13 TRUE           No              No                No internet ser8 0967-BMLBD Female                 0 TRUE      TRUE               11 TRUE           No              No                No internet ser9 1269-FOYWN Male                   0 TRUE      TRUE               44 TRUE           No              No                No internet ser10 3556-BVQGL Female                 0 TRUE      FALSE              20 TRUE           No              No                No internet ser# ℹ more rows
# ℹ 31 more variables: OnlineBackup_x <chr>, DeviceProtection_x <chr>, TechSupport_x <chr>, StreamingTV_x <chr>,
#   StreamingMovies_x <chr>, Contract_x <chr>, PaperlessBilling_x <lgl>, PaymentMethod_x <chr>, MonthlyCharges_x <dbl>,
#   TotalCharges_x <chr>, Churn_x <lgl>, gender_y <chr>, SeniorCitizen_y <int>, Partner_y <lgl>, Dependents_y <lgl>, tenure_y <int>,
#   PhoneService_y <lgl>, MultipleLines_y <chr>, InternetService_y <chr>, OnlineSecurity_y <chr>, OnlineBackup_y <chr>,
#   DeviceProtection_y <chr>, TechSupport_y <chr>, StreamingTV_y <chr>, StreamingMovies_y <chr>, Contract_y <chr>,
#   PaperlessBilling_y <lgl>, PaymentMethod_y <chr>, MonthlyCharges_y <dbl>, TotalCharges_y <chr>, Churn_y <lgl>
# ℹ Use `print(n = ...)` to see more rows

Tabla3 has class

class(tabla3)
[1] "tbl_BigQueryConnection" "tbl_dbi"                "tbl_sql"                "tbl_lazy"               "tbl"                   

How do I convert tabla3 to a bq_table and later save in a new bigquery table ???
Thanks

Maybe sometthing like that.

write_table_bq  <-  function(table_to_write, project,dataset, table_new){

  query  <- stringr::str_glue("CREATE TABLE `{project}.{dataset}.{table_new}` AS {dbplyr::sql_render(table_to_write) %>% as.character()}")

  bigrquery::bq_project_query(project, query)
}
hadley commented

You should be able to use compute(tbl, "name", temporary = FALSE).

Thanks @hadley by answer me and for your bigrquery package. bigrquery will help me in my job the next months.

compute(tbl, "name", temporary = FALSE) not work , (using name like myproject.dataset.nametable, neither with dataset.nametable or only nametable.

For write tables I use the function in my previous coment. write_table_bq
And this morning I'm working in a simple way to write table in temporary dataset and then bring the refereence again to use in bigrquery and use dplyr syntax.
What do you think about this idea?

get_bq_table <-  function(table, project,dataset){
  query  <- stringr::str_glue("{dbplyr::sql_render(table) %>% as.character()}")
 tb_bq  <- bigrquery::bq_project_query(project, query)
}


get_lazy_from_bq <- function(table, con){
    meta <- bigrquery::bq_table_meta(table)
    tb_name <- paste0(meta$tableReference$datasetId, ".", meta$tableReference$tableId)
    con  %>% tbl(I(tb_name))
  }

bq_compute <- function(table) {

  atrib_con  <-  attributes(table$src$con)
  project  <- atrib_con$project
  dataset  <- atrib_con$dataset
  con <- table$src

  temp_table  <-  get_bq_table(table, project, dataset)
  ref_table  <-  get_lazy_from_bq(temp_table, con)
}

And I can do things like that

tabla4  <-  tabla1 %>% 
  inner_join(tabla2, by = "customerID")  %>%
  bq_compute()
hadley commented

If compute() doesn't work, it should be made to work, because that's the official dbplyr syntax.

Uhm, I'm just update my packages, and compute works perfectly. Thanks @hadley . I close the issue with this comment

The problem was if I have a connection without dataset .

sc <- src_bigquery("my_project", dataset = NULL,  max_pages = 10)

tabla1  <-  sc  %>% tbl(I("churn.tabla1"))
tabla2  <-  sc  %>% tbl(I("churn.tabla2"))


# This works
tabla3 <- tabla1 %>% 
  inner_join(tabla2, by = "customerID") %>%
  compute()

# This not works. 
tabla3  %>% 
compute("churn.tabla3.", temporary = FALSE)

But if I use src_bigquery with dataset, all works fine

sc <- src_bigquery("my_project", dataset = "churn",  max_pages = 10)
hadley commented

Shouldn't that be tabla3 %>% compute(I("churn.tabla3"), temporary = FALSE) ? If that doesn't work, I'd say it's a bug.

Ups.. You're right. The I is neccesary

tabla3  %>% compute(I("churn.test"), temporary = FALSE)

load_tabla3  <- sc  %>% tbl(I("churn.test"))

load_tabla3  %>% head(6)
# Source:   SQL [6 x 41]
# Database: BigQueryConnection
  customerID gender_x SeniorCitizen_x Partner_x Dependents_x tenure_x
  <chr>      <chr>              <int> <lgl>     <lgl>           <int>
1 8883-GRDWQ Male                   1 FALSE     FALSE              20
2 1088-CNNKB Male                   0 TRUE      FALSE              63
3 2282-YGNOR Female                 0 FALSE     FALSE              29
4 2361-FJWNO Male                   0 FALSE     FALSE              40
5 3605-JISKB Male                   1 TRUE      FALSE              55
6 4003-OCTMP Female                 0 TRUE      FALSE              31

Thanks a lot. In next months I'll use this package in my job..