moj-analytical-services/dbtools

array and struct errors

isichei opened this issue · 2 comments

From AP slack:

Does dbtools::get_athena_query_response() allow for querying of ad hoc arrays?
eg. should "SELECT ARRAY[1, 2, 3] AS list" return anything? I'm getting
Error in py_call_impl(callable, dots$args, dots$keywords) : KeyError: 'array'

On replication traceback is...

 Error in py_call_impl(callable, dots$args, dots$keywords) : 
  KeyError: 'array' 

4. | stop(structure(list(message = "KeyError: 'array'", call = py_call_impl(callable,      dots$args, dots$keywords), cppstack = structure(list(file = "",      line = -1L, stack = c("/home/isichei/R/library/reticulate/libs/reticulate.so(Rcpp::exception::exception(char const*, bool)+0x84) [0x7f24832581f4]",      "/home/isichei/R/library/reticulate/libs/reticulate.so(Rcpp::stop(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x27) [0x7f2483258347]",  ...

3. | get_athena_query_response at <string>#57

2. | get_athena_query_response(sql_query = sql_query, bucket = bucket,      output_folder = output_folder, return_athena_types = return_athena_types,      timeout = timeout)

1. | dbtools::get_athena_query_response("SELECT ARRAY[1, 2, 3] AS list",      "alpha-everyone")

☝️ My assumption is that the underlying python script is trying to take each Athena data type and converting it to our agnostic data. However, as our agnostic data doesn't (yet) support STRUCT or ARRAY you're getting that key error as there is no array key in the lookup dictionary.

Running the following

dbtools::get_athena_query_response('SELECT ARRAY[1, 2, 3] AS list', 'alpha-everyone', return_athena_types=T)

Works and therefore supports the theory above. Probably should error out for the time being but the error should be more informative.

For the record - you can turn the ARRAY into a STRUCT which dbtools is happy to return, so the functionality of arrays can be used in Athena queries.
To return a table with

n
0
1
2
3
Use:
SELECT t.n
FROM (
  SELECT sequence(0,3) AS n
)
CROSS JOIN UNNEST(n) AS t (n);