ropensci-archive/rtweet

Get the text in the image description of a tweet

Closed this issue · 7 comments

Problem

I am trying to get the text in the image description of a tweet (the ext_alt_text field), but {rtweet} always returns NA.

Expected behavior

The code in the example below should return a string with the word "sort" and a number, like "sort 893".

Reproduce the problem

library(rtweet)

# parsing the tweet data
last_tweet_parsed <- rtweet::get_timeline(user = 'esquinadobrasil',
                                          n = 1,
                                          parse = TRUE)
# getting the media_alt_text
entities <- last_tweet_parsed$entities

entities[[1]]$media$ext_alt_text
#> [1] NA

rtweet version

## copy/paste output
packageVersion("rtweet")
[1] ‘1.0.2’

Session info

## copy/paste output
sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] gdalio_0.0.1.9014 remotes_2.4.2     rtweet_1.0.2      terra_1.6-47      sf_1.0-9         

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9         pillar_1.8.1       compiler_4.1.1     prettyunits_1.1.1 
 [5] class_7.3-19       tools_4.1.1        progress_1.2.2     bit_4.0.5         
 [9] tibble_3.1.8       jsonlite_1.8.4     memoise_2.0.1      lifecycle_1.0.3   
[13] pkgconfig_2.0.3    rlang_1.0.6        DBI_1.1.3          cli_3.5.0         
[17] curl_4.3.3         fastmap_1.1.0      e1071_1.7-12       s2_1.1.1          
[21] vapour_0.9.2       withr_2.5.0        httr_1.4.4         vctrs_0.5.1       
[25] askpass_1.1        hms_1.1.2          bit64_4.0.5        classInt_0.4-8    
[29] grid_4.1.1         glue_1.6.2         R6_2.5.1           fansi_1.0.3       
[33] magrittr_2.0.3     codetools_0.2-18   ellipsis_0.3.2     units_0.8-1       
[37] mime_0.12          renv_0.16.0        utf8_1.2.2         KernSmooth_2.23-20
[41] proxy_0.4-27       wk_0.7.1           openssl_2.0.5      cachem_1.0.6      
[45] crayon_1.5.2  
llrs commented

I'll double check if there is something wrong with the parsing. Is there something in the response when parse = FALSE?

Thank you for looking into this and thanks for a great package, @llrs!

When using parse = FALSE, the output brings no info either. The variable is not even there.

last_tweet <- rtweet::get_timeline(user = 'esquinadobrasil',
                                   n = 1,
                                   parse = FALSE)

last_tweet[[1]][[1]]$entities$media

> [[1]]
>            id              id_str  indices                                      media_url
> 1 1.61501e+18 1615009514297729024 174, 197 http://pbs.twimg.com/media/FmmqmLiXoAAdEmw.jpg
>                                   media_url_https                     url
> 1 https://pbs.twimg.com/media/FmmqmLiXoAAdEmw.jpg https://t.co/z1YDyTJArx
>                  display_url
> 1 pic.twitter.com/z1YDyTJArx
>                                                             expanded_url  type sizes.thumb.w
> 1 https://twitter.com/esquinadobrasil/status/1615009611186069504/photo/1 photo           150
>   sizes.thumb.h sizes.thumb.resize sizes.small.w sizes.small.h sizes.small.resize
> 1           150               crop           549           680                fit
>   sizes.medium.w sizes.medium.h sizes.medium.resize sizes.large.w sizes.large.h
> 1            970           1200                 fit          1655          2048
>   sizes.large.resize
> 1                fit
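Rather than reading the printed columns by eye, the same check can be done programmatically. This is only a minimal sketch: has_media_field is a hypothetical helper (not part of rtweet), and the mock data frame reuses just two of the columns shown above so the example runs without API credentials.

```r
# Hypothetical helper (not part of rtweet): for each media data frame in an
# unparsed API v1 response, report whether a given column is present.
has_media_field <- function(media_list, field = "ext_alt_text") {
  vapply(
    media_list,
    function(m) is.data.frame(m) && field %in% names(m),
    logical(1)
  )
}

# Mock input reusing a few columns from the output above; the real object
# would come from last_tweet[[1]][[1]]$entities$media.
mock_media <- list(
  data.frame(
    id_str = "1615009514297729024",
    type = "photo",
    stringsAsFactors = FALSE
  )
)

field_present <- has_media_field(mock_media)
print(field_present)
```

With the real response, a FALSE here would confirm that the field is missing from the API reply itself and not dropped during parsing.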
llrs commented

Mmh, it seems that they are making some changes to the API that affect API v1. Perhaps they don't affect API v2 (as far as I know, all development in the backend affects the public/all the apps).

I'll check and see if I need to redouble the efforts to provide support for the new API. Thanks for your kind words.

Thanks so much! For the record, I had opened a question on SO.

llrs commented

Ok, as expected I see that the API v2 works well...

llrs commented

In the devel version 1.1.0.9006, or in the next release, it will be possible to do this:

gt <- tweet_get("1615009611186069504",
                expansions = 'attachments.media_keys',
                fields = set_fields(media = "alt_text", NULL, NULL, NULL, NULL),
                parse = FALSE)
gt[[1]]$includes$media[[1]]$alt_text
## [1] "sort 893"

Closing now (but still working on how to parse the data to make it easier to access).
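Until that parsing is available, a small wrapper could pull the alt text out of the unparsed response. The following is a sketch under assumptions: extract_alt_text is a hypothetical helper (not an rtweet function), and the mock page only imitates the includes$media shape accessed as gt[[1]] above, with made-up media_key values; a real response may be structured differently in other respects.

```r
# Hypothetical helper (not part of rtweet): collect the alt text of every
# media item in one unparsed API v2 response page.
extract_alt_text <- function(page) {
  media <- page$includes$media
  if (is.null(media)) {
    return(character(0))
  }
  vapply(
    media,
    function(m) if (is.null(m$alt_text)) NA_character_ else m$alt_text,
    character(1)
  )
}

# Mock page imitating the shape of gt[[1]]; media_key values are invented.
mock_page <- list(
  includes = list(
    media = list(
      list(media_key = "3_1", type = "photo", alt_text = "sort 893"),
      list(media_key = "3_2", type = "photo")  # no alt text set
    )
  )
)

alt_texts <- extract_alt_text(mock_page)
print(alt_texts)
```

Media items without a description come back as NA, so the result always has one element per attached image. With a real response, the call would presumably be extract_alt_text(gt[[1]]).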

thanks