not very long query (less than 1024 characters) returns 400 error
polisci-quant-nerd opened this issue · 1 comments
Please confirm the following
- I have searched the existing issues
- The behaviour of the program deviates from what is described in the documentation.
- I can reproduce this problem more than once.
- This is NOT a 3-digit error -- it does not display an error message like something went wrong. Status code: 400.
- This is a 3-digit error and I have consulted the Understanding API errors vignette and the suggestions do not help.
Describe the bug
Hi, I built a query that used to work perfectly. Since this month, when I try to get more tweets with the same query, I get a 400 error. I don't know why the query stopped working. I checked its length: it is only 518 characters. It is just some terms combined with AND and OR operators, and I don't think it contains any unrecognisable terms. The value of n makes no difference; it always returns the same error. I can't figure out what is wrong with my old query.
query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus or \"public health\"))", is_retweet = FALSE, lang = "en")
tweet_mar_aug_2020 <- get_all_tweets(bearer_token = bearer_token_granted,
                                     query = query,
                                     start_tweets = "2020-03-01T00:00:00Z",
                                     end_tweets = "2020-08-01T00:00:00Z",
                                     bind_tweets = FALSE,
                                     file = "mar_aug_2020_eng",
                                     data_path = "~/covid/1_Data/tweet_json/",
                                     n = Inf)
Expected Behavior
It should work as before.
Steps To Reproduce
> query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus or \"public health\"))", is_retweet = FALSE, lang = "en")
tweet_mar_aug_2020 <- get_all_tweets(bearer_token = bearer_token_granted,
                                     query = query,
                                     start_tweets = "2020-03-01T00:00:00Z",
                                     end_tweets = "2020-08-01T00:00:00Z",
                                     bind_tweets = FALSE,
                                     file = "EU_mar_aug_2020_eng",
                                     data_path = "~/Desktop/1_Data/tweet_json/",
                                     n = Inf)
query: ((#EU OR #EuropeanUnion OR "EU" OR "European Union" OR "EuropeanUnion" OR "European Comission" OR #EuropeanComission OR #EUCommission OR "European Central Bank" OR "ECB" OR #ECB OR @ECB OR @EU_Comission OR "European Parliament" OR "European Council" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR "corona" OR "covid" OR "pandemic" OR "coronavirus" OR #pandemic OR #coronavirus or "public health")) -is:retweet lang:en
Error in make_query(url = endpoint_url, params = params, bearer_token = bearer_token, :
something went wrong. Status code: 400
In addition: Warning messages:
1: Tweets will still be bound in local memory to generate .rds file. Argument (bind_tweets = FALSE) only valid when just a data path has been specified.
2: Directory already exists. Existing JSON files may be parsed and returned, choose a new path if this is not intended.
Environment
R version 4.1.1 (2021-08-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
Anything else?
No response
I believe the error is in the last "or", which should be capitalized as "OR". The following works:
query <- build_query(query = "((#EU OR #EuropeanUnion OR \"EU\" OR \"European Union\" OR \"EuropeanUnion\" OR \"European Comission\" OR #EuropeanComission OR #EUCommission OR \"European Central Bank\" OR \"ECB\" OR #ECB OR @ECB OR @EU_Comission OR \"European Parliament\" OR \"European Council\" OR #EuropeanParliament OR #EuroParl OR #EUParl OR #EuropeanCouncil OR #EUCouncil OR #EUCO OR @EUROParl OR @EUCouncil) (#corona OR #covid OR \"corona\" OR \"covid\" OR \"pandemic\" OR \"coronavirus\" OR #pandemic OR #coronavirus OR \"public health\"))", is_retweet = FALSE, lang = "en")
tweet_mar_aug_2020 <- get_all_tweets(query = query,
                                     start_tweets = "2020-03-01T00:00:00Z",
                                     end_tweets = "2020-08-01T00:00:00Z",
                                     n = Inf)
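To catch this kind of slip before sending a long query, a small check in base R may help. This is only a minimal sketch: the helper name find_lowercase_or is hypothetical and not part of academictwitteR, and it assumes, as suggested above, that the operator is only accepted in upper case. It ignores words inside double-quoted phrases, so a phrase such as "this or that" is not flagged.

```r
# Hypothetical helper (not part of academictwitteR): return every "or" token
# outside double-quoted phrases that is not written fully in upper case.
find_lowercase_or <- function(query) {
  # Drop quoted phrases so words inside them are not treated as operators.
  unquoted <- gsub('"[^"]*"', " ", query)
  # Split on whitespace and parentheses to get the bare tokens.
  tokens <- strsplit(unquoted, "[[:space:]()]+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  # Keep tokens that spell OR in any case other than all upper case.
  tokens[toupper(tokens) == "OR" & tokens != "OR"]
}

# Shortened versions of the last group from the query in this issue.
bad  <- '(#pandemic OR #coronavirus or "public health")'
good <- '(#pandemic OR #coronavirus OR "public health")'

find_lowercase_or(bad)   # flags the lowercase token
find_lowercase_or(good)  # nothing to flag
```

Running the helper on the raw query string before passing it to build_query would point at the offending token without needing a request to the API.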