The scraper skips some tweets when using --until
minamotorin opened this issue · 4 comments
I found that this part of the code in url.py
if "win" in platform:
    return f'\"{date.split()[0]}\"'
sometimes makes the scraper skip some tweets when using --until "%Y-%m-%d %H:%M:%S" on Windows: scraping starts some hours before the specified time. Removing these lines seems to give better results.
Originally posted by @Tortar in #8 (comment)
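To make the effect of the quoted branch concrete, here is a minimal sketch. The function name windows_branch is hypothetical; only the return expression is taken from the quoted code (written without the redundant backslashes, which does not change the result). It shows that the time of day is discarded, which may be why results start at a different time than the one passed to --until.

```python
def windows_branch(date: str) -> str:
    # Same expression as the quoted line: keep only the text before the
    # first whitespace and wrap it in double quotes.
    return f'"{date.split()[0]}"'


with_time = windows_branch("2020-11-09 9:00:00")
date_only = windows_branch("2020-11-09")

print(with_time)
print(date_only)
# Both inputs collapse to the same value, so the "9:00:00" part is lost.
print(with_time == date_only)
```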
@Tortar
The code you pointed out was originally added as a workaround for twintproject#597.
When I fixed twintproject#1136, I changed that code (188e521).
But I didn't remove it because I can't rule out the possibility that twintproject#597 would recur.
If someone helps debug on Windows and confirms that twintproject#597 is resolved, I will be able to remove the code.
I will work on it. For now, --until seems to work fine, while --since has some problems.
Actually, on Windows it works perfectly without
if "win" in platform:
    return f'\"{date.split()[0]}\"'
I tried
twint -s pineapple --until "2020-11-09 9:00:00" --since "2020-11-09 1:00:00"
and
twint -s pineapple --until "2020-11-09" --since "2020-11-08"
and they work as expected, stopping at the right point. So, I think deleting those lines could be an improvement.
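For anyone repeating this check, the sketch below computes the time window that each of the two commands above asks for, independently of twint. parse_bound and FORMATS are hypothetical names, not twint's code; the only assumption is that both date formats used in this thread should be accepted.

```python
from datetime import datetime

# The two formats that appear in this thread: with and without a time of day.
FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_bound(value: str) -> datetime:
    # Try the more specific format first, then fall back to date-only.
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported date format: {value!r}")


# Window requested by the first command (same day, with times).
window_hours = parse_bound("2020-11-09 9:00:00") - parse_bound("2020-11-09 1:00:00")
# Window requested by the second command (date-only bounds).
window_days = parse_bound("2020-11-09") - parse_bound("2020-11-08")

print(window_hours)
print(window_days)
```

If the time of day were dropped from the first command, both of its bounds would parse to the same date and the requested window would be empty, so comparing the oldest and newest scraped tweets against these windows is a quick way to confirm the behaviour on a given platform.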