fabiogiglietto/CooRnet

get_ctshares() issue

aqibufu opened this issue · 6 comments

Hi, contributors of CooRnet, thanks for your excellent work on the CooRnet package. I seem to have hit a bug or problem. When I use the function get_ctshares() to collect the sharing data, the process sometimes gets stuck with no warning or error. I have run into this three times. The first time, I stopped the process and tried again. The second time, when I stopped the process, R warned that "R is not responding to your request to interrupt processing, so to stop the current operation you may need to terminate R entirely". I selected No, the process started working again and continued collecting the sharing data. This time, however, I hit the problem again, and doing the same thing as the second time no longer works. I don't know why I keep running into this; maybe it's an API problem? But I ran another analysis with CooRnet before, and the API worked normally. Maybe you can help me work this out? I could simply stop the process and try again, but that wastes a lot of time.
Thanks!

Hi :)
the issue you describe may be related to a lack of resources on your machine. Unfortunately, R doesn't handle these cases very well and just hangs. Could this be the case? How many URLs are you starting from?

Hi, Fabio.
Considering that our dataset is relatively large, I divided it into several parts. The first part contains the links from 60,000 Facebook posts. I had actually tried another dataset before, with links from about 120,000 Facebook posts, and get_ctshares() collected its URL sharing data successfully. For now, the best solution I can think of is to split the dataset into several small datasets and hope that this problem does not occur on small datasets.
Thanks
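For what it's worth, the chunking approach can be sketched in a few lines of base R. This is only an illustration, not CooRnet code: `ct_posts` is a made-up posts data frame with a `url` column, `chunk_size` is an arbitrary number to tune to your machine, and the commented get_ctshares() call follows the signature shown in the CooRnet README.

```r
# Sketch: split a large posts dataset into chunks before calling get_ctshares().
# `ct_posts` and `chunk_size` are hypothetical; tune chunk_size to your memory.
chunk_size <- 10000
ct_posts <- data.frame(url = sprintf("https://example.com/%d", 1:25000))

# Assign each row a chunk id, then split into a list of smaller data frames
chunk_id <- ceiling(seq_len(nrow(ct_posts)) / chunk_size)
chunks <- split(ct_posts, chunk_id)

# Collect shares chunk by chunk, so a hang only costs you one chunk.
# (Requires a CrowdTangle API token; signature per the CooRnet README.)
# results <- lapply(chunks, function(df) {
#   CooRnet::get_ctshares(df, url_column = "url", date_column = "date")
# })
# ct_shares <- do.call(rbind, results)

length(chunks)  # 3 chunks of up to 10,000 URLs each
```

Saving each chunk's result to disk as it completes (e.g. with saveRDS()) also means a crash midway doesn't lose earlier chunks.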

it really boils down to the number of unique URLs shared by the posts you are starting from. A dataset of 60k Facebook posts may include a very different number of unique URLs depending on the type of posts collected. Please also keep in mind that, due to a limitation of CrowdTangle's API links endpoint, some URLs (e.g. Telegram bots) are incorrectly interpreted as a domain search. As a result, you may get back a large number of shares (CooRnet gets up to 10k shares for each link) that are totally unrelated to your original URL.
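Since the unique-URL count is what drives the workload, it's worth checking it before launching a long collection. A minimal base R check, assuming a hypothetical posts data frame with a `url` column:

```r
# How many unique URLs will actually be queried against the API?
# `ct_posts` is a hypothetical posts data frame with a "url" column.
ct_posts <- data.frame(url = c("https://a.example/1",
                               "https://a.example/1",
                               "https://b.example/2"))
n_unique <- length(unique(ct_posts$url))
n_unique  # 2
```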

Yes, I know the URLs are actually the unique URLs shared by the posts I'm starting from. :) Your reminder also surprised me. I have also hit the error "unexpected http response code 503 on call". Could the 503 be the limitation you mentioned?
For now, I have split the whole posts dataset into several small datasets and asked to increase the rate limit of my token.

Nope, 503 is an HTTP server error on the CrowdTangle API side.
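Since a 503 is a transient server error, one option is to retry with a growing wait between attempts instead of failing outright. A minimal base R sketch (not part of CooRnet): `do_request` stands in for whatever issues the API call and is assumed to raise an error on a 503 response.

```r
# Retry-with-exponential-backoff sketch in base R.
# `do_request` is a placeholder function that performs the API call and
# signals an error (e.g. on a 503) when the request fails.
retry_on_503 <- function(do_request, max_tries = 5, base_wait = 2) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(do_request(), error = function(e) e)
    if (!inherits(result, "error")) return(result)  # success: return it
    if (attempt == max_tries) stop(result)          # out of attempts: give up
    Sys.sleep(base_wait ^ attempt)  # wait 2, 4, 8... seconds, then retry
  }
}
```

Packages like httr offer the same idea ready-made via httr::RETRY(), if you are making CrowdTangle calls yourself.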

Oh, I see. Thanks!