the-convocation/twitter-scraper

scraper.getTweet returns 404

VanillaDevelop opened this issue · 10 comments

On v0.5.0, code that worked two days ago now returns only 404 errors from scraper.getTweet, for example when retrieving https://twitter.com/elonmusk/status/1689963696703848449. I tested over a VPN in case my IP was blocked, but got the same result. The headers also suggest that it's not a rate-limit issue ('x-rate-limit-remaining': [ '499' ]).

Can you replicate this or is it a local issue? I'd imagine it's just part of the ongoing Elon Musk self-destruct situation, but maybe you have an idea of how to fix it or what causes it.

Using code

import { Scraper } from "@the-convocation/twitter-scraper";
const scraper = new Scraper();
// getTweet is async, so await it; otherwise this logs a pending Promise
console.log(await scraper.getTweet("1689963696703848449"));

Error (I removed some parts from the passthrough and the URL, but most of the information is there)

ApiError: Response status: 404
    at requestApi (A:\Code\test\node_modules\@the-convocation\twitter-scraper\dist\api.js:57:18)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async getTweet (A:\Code\test\node_modules\@the-convocation\twitter-scraper\dist\tweets.js:105:17) {
  response: Response {
    [...]
    [Symbol(Response internals)]: {
      url: 'https://api.twitter.com/graphql/[...]',
      status: 404,
      statusText: 'Not Found',
      headers: Headers {
        [Symbol(map)]: [Object: null prototype] {
          date: [ 'Wed, 16 Aug 2023 17:22:25 UTC' ],
          perf: [ '7626143928' ],
          server: [ 'tsa_o' ],
          'cache-control': [ 'no-cache, no-store, max-age=0' ],
          'content-length': [ '0' ],
          'x-transaction-id': [ 'b0234b43ce9dcd0f' ],
          'x-rate-limit-limit': [ '500' ],
          'x-rate-limit-reset': [ '1692207445' ],
          'x-rate-limit-remaining': [ '499' ],
          'strict-transport-security': [ 'max-age=631138519' ],
          'x-response-time': [ '109' ],
          'x-connection-hash': [
            'c2e3a1e959df9fac9591a6faaf7e126385a8ca2a0f3c36589d55e74e60b0ec96'
          ],
          connection: [ 'close' ]
        }
      },
      counter: 0
    }
  }
}
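
For anyone triaging a similar failure, the rate-limit reasoning above can be sketched as a small check on the response headers. This is only an illustration: `readRateLimit` and `isLikelyRateLimited` are hypothetical helpers, not part of @the-convocation/twitter-scraper, and the sketch assumes that `x-rate-limit-reset` is a Unix timestamp in seconds and that a standard `Headers` object (Node 18+ or a fetch polyfill) is available.

```typescript
// Hypothetical helpers, not part of @the-convocation/twitter-scraper.
interface RateLimitInfo {
  limit: number;
  remaining: number;
  resetAt: Date;
}

// Reads the x-rate-limit-* headers; returns null if any of them is missing.
function readRateLimit(headers: Headers): RateLimitInfo | null {
  const limit = headers.get("x-rate-limit-limit");
  const remaining = headers.get("x-rate-limit-remaining");
  const reset = headers.get("x-rate-limit-reset");
  if (limit === null || remaining === null || reset === null) {
    return null;
  }
  return {
    limit: Number(limit),
    remaining: Number(remaining),
    // Assumption: the reset value is Unix epoch seconds.
    resetAt: new Date(Number(reset) * 1000),
  };
}

// Treats HTTP 429 or an exhausted quota as a rate limit; anything else is not.
function isLikelyRateLimited(status: number, headers: Headers): boolean {
  if (status === 429) {
    return true;
  }
  const info = readRateLimit(headers);
  return info !== null && info.remaining === 0;
}

// Header values copied from the error output above.
const sampleHeaders = new Headers({
  "x-rate-limit-limit": "500",
  "x-rate-limit-remaining": "499",
  "x-rate-limit-reset": "1692207445",
});

console.log(readRateLimit(sampleHeaders));
console.log(isLikelyRateLimited(404, sampleHeaders));
```

With the values from this report, 499 of 500 requests remain, so under these assumptions the 404 would not be explained by rate limiting.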

I have the same problem with getTweets.
I tried to fix it by changing this URL

`https://api.twitter.com/graphql/u7wQyGi6oExe8_TRWGMq4Q/UserResultByScreenNameQuery?${params.toString()}`,
to https://twitter.com/i/api/graphql/SAMkL5y_N9pmahSw8yy6gw/UserByScreenName
I'm not sure it's the correct URL, but at least it doesn't fail. Now this line fails instead:
`https://api.twitter.com/graphql/8IS8MaO-2EN6GZZZb8jF0g/UserWithProfileTweetsAndRepliesQueryV2?${params.toString()}`,

and I have no idea how to find the correct URL.
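
To make the shape of these URLs easier to compare, here is a minimal sketch of how such an endpoint could be assembled from a base path, a query ID, and an operation name. `buildGraphqlUrl` is a hypothetical helper, not the library's code. The `variables` query parameter and the `screen_name` field are assumptions for illustration, and the query ID is simply the one quoted above, which may stop working at any time.

```typescript
// Hypothetical helper, not part of @the-convocation/twitter-scraper.
// Builds "<base>/<queryId>/<operation>?variables=<json>".
function buildGraphqlUrl(
  base: string,
  queryId: string,
  operation: string,
  variables: Record<string, unknown>,
): string {
  const params = new URLSearchParams();
  // Assumption: the endpoint reads its arguments from a JSON "variables" parameter.
  params.set("variables", JSON.stringify(variables));
  return `${base}/${queryId}/${operation}?${params.toString()}`;
}

// Query ID and operation name taken from the comment above; they are not verified.
const profileUrl = buildGraphqlUrl(
  "https://twitter.com/i/api/graphql",
  "SAMkL5y_N9pmahSw8yy6gw",
  "UserByScreenName",
  { screen_name: "elonmusk" }, // assumed variable name
);

console.log(profileUrl);
```

Keeping the query ID and operation name as separate arguments means only two strings need updating when an endpoint changes.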

getProfile too

Judging from other projects, fetching from these APIs with the guest token no longer works, and you'd have to switch to OAuth 1.0. Nitter, for example, generates temporary anonymous guest users and calls the above APIs with OAuth keys/secrets.

Some tweet-related functions are still broken, but most of the breakages have been fixed in v0.6.0. It looks like the cause was entirely API endpoints being changed, added, or removed all at once. I still need to figure out what's up with replies, though (/TweetDetail when logged in and viewing a tweet), contributions welcome 👀

More fixes in #62, the only thing that still isn't working is tweet threads.

Owen3H commented

Unrelated to threads, but I've seen other libraries using this endpoint:
/i/api/graphql/naCjgapXCSCsbZ7qnnItQA/ListLatestTweetsTimeline

May or may not be useful, thought I'd share it anyways.

@dandyraka have you figured out the problem with getProfile? I'm currently having trouble, as the wait time is very long.

@Owen3H what other libraries? Can you share a few links?

Owen3H commented

> @Owen3H what other libraries? Can you share a few links?

I got that specific query from Rettiwt, but there are some others using the ListLatestTweetsTimeline endpoint also.

twitter-openapi-typescript
OldTwitter (JS/Web Extension)
twscrape (Python)