[BUG] Defensive programming: build_query
chainsawriot opened this issue · 2 comments
Please confirm the following
- I have searched the existing issues
- The behaviour of the program deviates from what is described in the documentation.
- I can reproduce this problem more than once.
- This is NOT a 3-digit error -- it does not display an error message like something went wrong. Status code: 400.
- This is a 3-digit error and I have consulted the Understanding API errors vignette and the suggestions do not help.
Describe the bug
Although it is quite clear that build_query (and functions depending on it, e.g. get_all_tweets) accepts only a character string or character vector, the program does not enforce this. One symptom is that build_query will accept a single-column data frame, and weirdness like #230 arises. It also generates queries that are certain to return a 400 error.
Expected Behavior
Two solutions: coerce any input to a vector, or simply raise an error when the input of build_query is not a character string or character vector. My preferred solution is the latter.
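The preferred fix could look roughly like the check below. This is a minimal sketch, not the package's implementation: check_character_input is a hypothetical helper name, and it assumes that an argument left unset arrives as NULL.

```r
# Hypothetical helper (not part of academictwitteR): stop early unless the
# argument is NULL (assumed here to mean "not supplied") or a character vector.
check_character_input <- function(x, arg_name) {
  if (is.null(x)) {
    return(invisible(TRUE))
  }
  # is.character() is FALSE for data frames and other lists; is.vector()
  # additionally rejects character matrices and arrays.
  if (!is.character(x) || !is.vector(x)) {
    stop("`", arg_name, "` must be a character vector, not an object of class ",
         paste(class(x), collapse = "/"), ".", call. = FALSE)
  }
  invisible(TRUE)
}

x <- data.frame(username = c("a", "b", "c"))

# A plain character vector passes silently.
check_character_input(x$username, "users")

# A data frame is rejected with an informative message.
result <- tryCatch(check_character_input(x, "users"),
                   error = function(e) conditionMessage(e))
print(result)
```

Calling such a check once per argument at the top of build_query would make the failure happen before any query string is assembled, rather than surfacing later as a 400 from the API.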
Steps To Reproduce
require(academictwitteR)
#> Loading required package: academictwitteR
x <- data.frame(username = c("a", "b", "c"))
build_query(query = x)
#> username
#> 1 a
#> 2 b
#> 3 c
build_query(users = x)
#> [1] " (from:c(\"a\", \"b\", \"c\"))"
build_query(reply_to = x)
#> [1] " (to:c(\"a\", \"b\", \"c\"))"
build_query(retweets_of = x)
#> [1] " (retweets_of:c(\"a\", \"b\", \"c\"))"
get_all_tweets(users = x, start_tweets = "2021-01-01Z00:00:00", end_tweets = "2021-02-01Z00:00:00")
#> Warning: Recommended to specify a data path in order to mitigate data loss when
#> ingesting large amounts of data.
#> Warning: Tweets will not be stored as JSONs or as a .rds file and will only be
#> available in local memory if assigned to an object.
#> query: (from:c("a", "b", "c"))
#> Error in make_query(url = endpoint_url, params = params, bearer_token = bearer_token, : something went wrong. Status code: 400
Created on 2021-08-24 by the reprex package (v2.0.0)
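The malformed from:c(...) strings are consistent with the data frame being coerced to character as a list. The base R snippet below reproduces the same text without the package. Whether build_query uses paste0() internally is an assumption here, and the output shown in the reprex above presumes R >= 4.0.0, where data.frame() keeps strings as character rather than converting them to factors.

```r
x <- data.frame(username = c("a", "b", "c"))

# A data frame is a list of columns; as.character() on a list deparses each
# element, so the whole column collapses into a single string.
as_list_string <- as.character(x)
print(as_list_string)

# Pasting a prefix onto the data frame therefore embeds the deparsed call.
bad <- paste0("from:", x)
print(bad)

# Pasting onto the underlying character vector gives one term per user.
good <- paste0("from:", x$username)
print(good)
```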
Environment
#> R version 4.1.1 (2021-08-10)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.2 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] knitr_1.33 magrittr_2.0.1 rlang_0.4.11 fansi_0.5.0
#> [5] stringr_1.4.0 styler_1.4.1 highr_0.9 tools_4.1.1
#> [9] xfun_0.24 utf8_1.2.2 withr_2.4.2 htmltools_0.5.1.1
#> [13] ellipsis_0.3.2 yaml_2.2.1 digest_0.6.27 tibble_3.1.3
#> [17] lifecycle_1.0.0 crayon_1.4.1 purrr_0.3.4 vctrs_0.3.8
#> [21] fs_1.5.0 glue_1.4.2 evaluate_0.14 rmarkdown_2.9
#> [25] reprex_2.0.0 stringi_1.7.3 compiler_4.1.1 pillar_1.6.2
#> [29] backports_1.2.1 pkgconfig_2.0.3
Anything else?
No response
Thanks for raising this. I think the best option here is to raise an error when the input is not a character vector, as you say. This has the added benefit of enforcing proper usage (e.g., not passing a data.frame object to build_query()).
#349 is also a manifestation of this, because a "column" subset from a tibble with single brackets is still a tibble, not a vector.
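On the caller's side, the distinction can be illustrated with a base data frame, where single-bracket selection by column name keeps the data frame wrapper, much as a tibble does. A small sketch using only base R (the column name username is taken from the reprex above):

```r
x <- data.frame(username = c("a", "b", "c"))

# Single brackets with one index keep the data frame structure.
still_df <- x["username"]
print(is.data.frame(still_df))

# `[[` (or `$`) extracts the underlying character vector, which is the kind
# of input build_query() is documented to accept.
users <- x[["username"]]
print(is.character(users))
```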