dcts/opensea-scraper

docs should warn that browserInstance requires stealth plugin

mlarcher opened this issue · 6 comments

I need to provide a browserInstance because my code runs in an Alpine Docker container. I was having weird issues, such as empty result sets from offersByScrolling and an error from offers because a split on some part of the HTML could not be found...
I eventually found out that my browserInstance was being detected by OpenSea, which was serving a warning page instead of the content and tripping up opensea-scraper.
Don't you think the docs should have a warning note about browserInstance, explaining that it needs to use puppeteer-extra-plugin-stealth?
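For anyone landing here with the same symptoms, this is roughly what a stealth-enabled instance could look like. It is only a sketch: it assumes puppeteer-extra and puppeteer-extra-plugin-stealth are installed, and launchStealthBrowser is a made-up helper name, not something opensea-scraper provides.

```javascript
// Sketch only: assumes puppeteer-extra and puppeteer-extra-plugin-stealth
// are installed alongside a puppeteer version that works on your image.
// launchStealthBrowser is a hypothetical helper, not part of opensea-scraper.
let stealthRegistered = false;

async function launchStealthBrowser(launchOptions = {}) {
  // Required lazily so the modules are only loaded when a browser is needed.
  const puppeteer = require("puppeteer-extra");
  const StealthPlugin = require("puppeteer-extra-plugin-stealth");

  // Register the plugin once, even if the helper is called several times.
  if (!stealthRegistered) {
    puppeteer.use(StealthPlugin());
    stealthRegistered = true;
  }

  // launchOptions is forwarded to puppeteer.launch (e.g. executablePath, args).
  return puppeteer.launch({ headless: true, ...launchOptions });
}
```

The browser returned here is what would be handed over as browserInstance when calling offers or offersByScrolling; the exact shape of the options object should be checked against the README.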

dcts commented

@mlarcher Valid point!

Just as an explanation, the idea behind providing your own puppeteer instance is that you can:

  1. fully customize the puppeteer instance
  2. and if you find a solution without stealth plugin, you could implement a "stealth-less" version

But I agree that a warning that the stealth plugin is most likely needed would not hurt at all. I would prefer not to throw an error, though, for the reason above. If you want, you can make a PR and add a warning; otherwise I'll add it to my queue.

Also may I ask why you used a custom puppeteer instance instead of using the default one? :)

I am using the scraper in an Alpine-based Docker container. This means I need to control the puppeteer version so that its requirements are met by the image. Typically, the puppeteer version has to be older than the latest one. See https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md#running-on-alpine for details.

For info, after some tweaks I managed to get the sandbox working, and I don't need to use browserInstance anymore.

Hey there, I think the best way to detect if the stealth plugin is active on a browser instance is to check the chrome launch args, which can be accessed through the browser object:

  const isUsingStealth = (browser._process?.spawnargs || []).includes(
    "--disable-blink-features=AutomationControlled"
  );
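
Building on the snippet above, the check could be wrapped so that it only warns instead of throwing, which matches the preference stated earlier in the thread. This is a sketch under assumptions: isUsingStealth and warnIfNotStealth are hypothetical names, the message text is invented, and the flag check is a heuristic (anyone can pass that launch arg by hand). It tries the public browser.process() accessor first and falls back to the private _process field used above.

```javascript
const STEALTH_FLAG = "--disable-blink-features=AutomationControlled";

// Heuristic: true if the browser's child process was spawned with the flag.
function isUsingStealth(browser) {
  // Use the public accessor when present, else the private field (may change).
  const proc =
    typeof browser?.process === "function" ? browser.process() : browser?._process;
  // proc is null/undefined for browsers that were connected to, not launched.
  const spawnArgs = proc?.spawnargs ?? [];
  return spawnArgs.includes(STEALTH_FLAG);
}

// Logs a warning (never throws) and returns true when a warning was issued.
function warnIfNotStealth(browser, log = console.warn) {
  if (isUsingStealth(browser)) return false;
  log(
    "browserInstance does not appear to use puppeteer-extra-plugin-stealth; " +
      "OpenSea may serve a warning page instead of content."
  );
  return true;
}
```

Because it only logs, a "stealth-less" custom instance keeps working; the caller just gets a hint about why results might come back empty.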

dcts commented

@berstend thank you so much, that works! :)

I was just wondering how this works. Is the spawn arg --disable-blink-features=AutomationControlled only used by the stealth plugin, and is that how we can know the browser was launched with that plugin? 🤔

dcts commented

fixed in PR #37