table identifier escaping causing dplyr interface errors
alistaire47 opened this issue · 1 comments
Not sure exactly what changed, but some version somewhere changed so I'm now getting errors when using the dplyr interface because it's putting double-quotes around the table identifier. Reprex:
> library(sergeant)
> tbl(src_drill('localhost'), 'cp.`employee.json`')
Request failed [500]. Retrying in 1 seconds...
Request failed [500]. Retrying in 2 seconds...
Request failed [500]. Retrying in 1.8 seconds...
Request failed [500]. Retrying in 1.7 seconds...
# Source: table<cp.`employee.json`> [?? x 8]
# Database: DrillConnection
Warning message:
PARSE ERROR: Encountered "FROM \"" at line 1, column 1.
SQL Query: SELECT *
FROM "cp.`employee.json`"
^
LIMIT 11
Original Query:
1: SELECT *
2: FROM "cp.`employee.json`"
3: LIMIT 11
Query Profile Error Link:
http://localhost:8047/profiles/772b3c22-8d04-43d5-b733-9e2ae0bcbb50
> con <- drill_connection('localhost')
If I try to run that query directly, it fails:
> drill_query(
+ con,
+ 'SELECT *
+ FROM "cp.`employee.json`"
+ LIMIT 11'
+ )
|======================================================================| 100%
Request failed [500]. Retrying in 1.4 seconds...
|======================================================================| 100%
Request failed [500]. Retrying in 1.9 seconds...
|======================================================================| 100%
Query ==> SELECT * FROM "cp.`employee.json`" LIMIT 11
PARSE ERROR: Encountered "FROM \"" at line 1, column 6.
SQL Query: SELECT *
FROM "cp.`employee.json`"
^
LIMIT 11
[Error Id: dc495019-3219-4b88-b5f8-9a4fd657dc0f ]
whereas if I remove the quotes around cp.`employee.json`
it works:
> drill_query(
+ con,
+ 'SELECT *
+ FROM cp.`employee.json`
+ LIMIT 11'
+ )
|======================================================================| 100%
# A tibble: 11 x 16
employee_id full_name first_name last_name position_id position_title
<dbl> <chr> <chr> <chr> <dbl> <chr>
1 1 Sheri No… Sheri Nowmer 1 President
2 2 Derrick … Derrick Whelply 2 VP Country Ma…
3 4 Michael … Michael Spence 2 VP Country Ma…
4 5 Maya Gut… Maya Gutierrez 2 VP Country Ma…
5 6 Roberta … Roberta Damstra 3 VP Informatio…
6 7 Rebecca … Rebecca Kanagaki 4 VP Human Reso…
7 8 Kim Brun… Kim Brunner 11 Store Manager
8 9 Brenda B… Brenda Blumberg 11 Store Manager
9 10 Darren S… Darren Stanz 5 VP Finance
10 11 Jonathan… Jonathan Murraiin 11 Store Manager
11 12 Jewel Cr… Jewel Creek 11 Store Manager
# … with 10 more variables: store_id <dbl>, department_id <dbl>,
# birth_date <chr>, hire_date <chr>, salary <dbl>, supervisor_id <dbl>,
# education_level <chr>, marital_status <chr>, gender <chr>,
# management_role <chr>
Results are the same in the Drill shell. I assume this is a result of a version or setting changing somewhere, but so far I can't locate it. What I'm running:
software | version |
---|---|
MacOS | 10.14.6 |
Apache Drill | 1.18.0 |
R | 4.0.3 |
sergeant | 0.9.0 |
dbplyr | 2.0.0 |
dplyr | 1.0.2 |
Happy to write a PR to fix this if this is something that can be fixed here, but so far I can't figure out what has to change.
Thank you for this! I use the caffeinated {sergeant} one b/c I have large resultsets and neglected to fix this {dplyr} regression in the REST package.