
Query shorter than 1024 characters (518) returns 400 error

polisci-quant-nerd opened this issue · 1 comments

Please confirm the following

  • I have searched the existing issues
  • The behaviour of the program deviates from what is described in the documentation.
  • I can reproduce this problem more than once.
  • This is NOT a 3-digit error -- it does not display an error message like something went wrong. Status code: 400.
  • This is a 3-digit error and I have consulted the Understanding API errors vignette and the suggestions do not help.

Describe the bug

Hi, I built a query that used to work perfectly. Since this month, however, fetching more tweets with the same query returns a 400 error, and I don't know why it stopped working. The query is only 518 characters long and consists of terms joined by AND and OR operators; I don't think it contains any unrecognisable terms. The value of n makes no difference: the same error is always returned. I can't figure out what is wrong with my old query.

query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus or \"public health\"))", is_retweet = FALSE, lang = "en")

                                   query = query,
                                   bind_tweets = FALSE,
                                   file = "mar_aug_2020_eng",
                                   data_path = "~/covid/1_Data/tweet_json/",
                                   n = Inf)

Expected Behavior

The query should work as it did before.

Steps To Reproduce

> query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus or \"public health\"))",  is_retweet = FALSE, lang = "en")

                                    query = query,
                                    bind_tweets = FALSE,
                                    file = "EU_mar_aug_2020_eng",
                                    data_path = "~/Desktop/1_Data/tweet_json/",
                                    n = Inf)
query:  ((#EU OR #EuropeanUnion OR "EU" OR "European Union" OR "EuropeanUnion" OR "European Comission" OR #EuropeanComission OR #EUCommission OR "European Central Bank" OR "ECB" OR #ECB OR @ECB OR @EU_Comission OR "European Parliament" OR "European Council" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR "corona" OR "covid" OR "pandemic" OR "coronavirus" OR #pandemic OR #coronavirus or "public health")) -is:retweet lang:en 
Error in make_query(url = endpoint_url, params = params, bearer_token = bearer_token,  : 
  something went wrong. Status code: 400
In addition: Warning messages:
1: Tweets will still be bound in local memory to generate .rds file. Argument (bind_tweets = FALSE) only valid when just a data path has been specified. 
2: Directory already exists. Existing JSON files may be parsed and returned, choose a new path if this is not intended. 


R version 4.1.1 (2021-08-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

Anything else?

No response

I believe the error is the last operator, written in lowercase as or (just before "public health"); it should be capitalized as OR. The following works:

query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus OR \"public health\"))", is_retweet = FALSE, lang = "en")

tweet_mar_aug_2020 <- get_all_tweets(query = query,
                                     n = Inf)
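
To catch this kind of slip before sending a request, a small check along the following lines might help. This is only a sketch: find_lowercase_operators() is a hypothetical helper written in base R, not part of academictwitteR, and it assumes that OR is the only operator that needs to be upper case and that quoted phrases should be ignored.

```r
# Hypothetical helper (not part of academictwitteR): report operators that
# appear outside double-quoted phrases but are not written in upper case.
find_lowercase_operators <- function(query, operators = "OR") {
  # Blank out quoted phrases so a literal "or" inside quotes is ignored.
  unquoted <- gsub('"[^"]*"', " ", query)
  # Split on whitespace and parentheses to get the bare tokens.
  tokens <- unlist(strsplit(unquoted, "[[:space:]()]+"))
  tokens <- tokens[nzchar(tokens)]
  # Keep tokens that read as an operator but are not fully upper case.
  is_operator <- toupper(tokens) %in% operators
  unique(tokens[is_operator & tokens != toupper(tokens)])
}

# Shortened versions of the second group from the query above.
bad  <- "(#pandemic OR #coronavirus or \"public health\")"
good <- "(#pandemic OR #coronavirus OR \"public health\")"

find_lowercase_operators(bad)   # flags the lowercase or
find_lowercase_operators(good)  # nothing flagged
```

Running the check on the raw query string before passing it to build_query() would have pointed at the lowercase or directly, since a 400 status code alone does not say which part of the query was rejected.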