This repo contains patches to enhance popular web automation libraries. Specifically, it targets the puppeteer and playwright packages.
Some aspects of automation libraries or browser behavior cannot be adjusted through settings or command-line switches. Therefore, we fix these issues by patching the library's source code. While this approach is fragile and may break as the libraries' source code changes over time, the goal is to maintain this repo with community help to keep the patches up to date.
Popular automation libraries rely on the CDP command Runtime.Enable, which allows receiving events from the Runtime domain. This is crucial for managing execution contexts used to evaluate JavaScript on pages, a key feature for any automation process.
However, there's a technique that detects the usage of this command, revealing that the browser is controlled by automation software like Puppeteer or Playwright. This technique is used by all major anti-bot software such as Cloudflare, DataDome, and others.
We've prepared a full article about our investigation on this leak, which you can read in our blog.
For more details on this technique, read DataDome's blog post: How New Headless Chrome & the CDP Signal Are Impacting Bot Detection.
In brief, it's a few lines of JavaScript on the page that are automatically called if Runtime.Enable was used.
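For illustration, here is a minimal sketch of the kind of check that has been publicly described for this leak. It assumes the commonly cited variant: an Error object gets a getter on its stack property and is then passed to console.debug; when the Runtime domain is enabled, the browser serializes the logged object for the CDP client and the getter fires. The helper name createRuntimeProbe is hypothetical, and real anti-bot scripts may work differently.

```javascript
// Hypothetical helper: builds an Error whose `stack` getter records any access.
function createRuntimeProbe() {
  let triggered = false;
  const error = new Error('probe');
  Object.defineProperty(error, 'stack', {
    configurable: true,
    get() {
      triggered = true; // something read the stack, e.g. a serializer
      return '';
    },
  });
  return {
    error,
    wasTriggered: () => triggered,
  };
}

// Intended use inside a browser page (not in Node, whose console
// formats errors itself and would read `stack` regardless):
//   const probe = createRuntimeProbe();
//   console.debug(probe.error);
//   // Assumption: probe.wasTriggered() turns true only when a CDP client
//   // (or open DevTools) is consuming console events.
```

Because open DevTools can trip the same getter, a check like this is a signal rather than proof of automation.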
Our fix disables the automatic Runtime.Enable command on every frame. Instead, we manually create contexts with unknown IDs when a frame is created. Then, when code needs to be executed, we have implemented two approaches to get the context ID. You can choose which one to use.
1. Create a new isolated context via Page.createIsolatedWorld and save its ID from the CDP response.
🟢 Pros: All your code will be executed in a separate isolated world, preventing page scripts from detecting your changes via MutationObserver. For more details, see the execution-monitor test.
🔴 Cons: You won't be able to access main context variables and code. While this is necessary for some use cases, the isolated context generally works fine for most scenarios. Also, web workers don't allow creating new worlds, so you can't execute your code inside a worker. This is a niche use case but may matter in some situations.
2. Call Runtime.Enable and then immediately call Runtime.Disable. This triggers Runtime.executionContextCreated events, allowing us to catch the proper context ID.
🟢 Pros: You will have full access to the main context.
🔴 Cons: There's a slight chance that during this short timeframe, the page will call code that leads to the leak. The risk is low, as detection code is usually called during specific actions like CAPTCHA pages or login/registration forms, typically right after the page loads. Your business logic is usually called a bit later.
🎉 Our tests show that both approaches are currently undetectable by Cloudflare or DataDome.
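The two approaches can be sketched at the CDP level roughly as follows. This is not the patch's actual code: the function names and the FakeCdpClient stand-in are hypothetical, and the sketch assumes a session object with send/on/off methods (the shape of a Puppeteer CDP session). Note that on the wire the commands are spelled Runtime.enable and Runtime.disable.

```javascript
const { EventEmitter } = require('node:events');

// Approach 1: create an isolated world and take the context ID from the response.
async function getIsolatedContextId(client, frameId, worldName = 'example_world') {
  const { executionContextId } = await client.send('Page.createIsolatedWorld', {
    frameId,
    worldName,
  });
  return executionContextId;
}

// Approach 2: briefly enable the Runtime domain, capture the main context ID
// from the executionContextCreated events, then disable the domain again.
async function getMainContextId(client, frameId) {
  let contextId = null;
  const onCreated = ({ context }) => {
    const aux = context.auxData || {};
    if (aux.frameId === frameId && aux.isDefault) {
      contextId = context.id;
    }
  };
  client.on('Runtime.executionContextCreated', onCreated);
  try {
    await client.send('Runtime.enable');
    await client.send('Runtime.disable');
  } finally {
    client.off('Runtime.executionContextCreated', onCreated);
  }
  return contextId; // null if no matching context was reported
}

// Stand-in for a CDP session so the sketch runs without a browser.
class FakeCdpClient extends EventEmitter {
  async send(method) {
    if (method === 'Page.createIsolatedWorld') {
      return { executionContextId: 7 };
    }
    if (method === 'Runtime.enable') {
      // A real browser reports every existing context once the domain is enabled.
      this.emit('Runtime.executionContextCreated', {
        context: { id: 3, auxData: { frameId: 'frame-1', isDefault: true } },
      });
    }
    return {};
  }
}
```

With a real session, the returned ID would then be passed as contextId to Runtime.evaluate. The window between the enable and disable calls is the short timeframe mentioned in the cons of the second approach.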
Important: After applying the patch, you need to enable it by setting the REBROWSER_PATCHES_RUNTIME_FIX_MODE environment variable. This allows you to easily switch between patched and non-patched versions based on your business logic.
REBROWSER_PATCHES_RUNTIME_FIX_MODE=alwaysIsolated - always run all scripts in an isolated context
REBROWSER_PATCHES_RUNTIME_FIX_MODE=enableDisable - use the Enable/Disable technique
REBROWSER_PATCHES_DEBUG=1 - enable some debugging messages
Remember, you can set these variables in different ways, for example, in code:
process.env.REBROWSER_PATCHES_RUNTIME_FIX_MODE = "alwaysIsolated"
or in command line:
REBROWSER_PATCHES_RUNTIME_FIX_MODE=alwaysIsolated node app.js
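If the mode depends on what a given task needs, the choice can be wrapped in a small helper. The sketch below is only an illustration: chooseFixMode and applyFixMode are hypothetical names, and it assumes the variable should be set before the browser is launched.

```javascript
const FIX_MODE_VAR = 'REBROWSER_PATCHES_RUNTIME_FIX_MODE';
const KNOWN_MODES = ['alwaysIsolated', 'enableDisable'];

// Hypothetical helper: tasks that must read main-context variables need
// enableDisable; everything else can stay in the isolated world.
function chooseFixMode({ needsMainContext }) {
  return needsMainContext ? 'enableDisable' : 'alwaysIsolated';
}

// Validates the mode and writes it to the given environment object.
function applyFixMode(mode, env = process.env) {
  if (!KNOWN_MODES.includes(mode)) {
    throw new Error(`Unknown fix mode: ${mode}`);
  }
  env[FIX_MODE_VAR] = mode;
  return env[FIX_MODE_VAR];
}

// Example: applyFixMode(chooseFixMode({ needsMainContext: false }));
```

Validating against the two documented values catches typos early, since a misspelled mode might otherwise go unnoticed.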
To test this leak, you can use this page: https://kaliiiiiiiiii.github.io/brotector/ (sources)
[Screenshots: before patch 👎 | after patch 👍]
This package is designed to be run against an installed library. Install the Puppeteer library, then call the patcher, and it's ready to go.
In the root folder of your project, run:
npx rebrowser-patches@latest patch
You can easily revert all changes with this command:
npx rebrowser-patches@latest unpatch
You can also patch a package by providing the full path to its folder, for example:
npx rebrowser-patches@latest patch --packagePath /web/app/node_modules/puppeteer-core-custom
You can see all command-line options by running npx rebrowser-patches@latest --help, but currently, there's just one patch for one library, so you don't need to configure anything.
Note: if you run npm install or yarn install in your project folder, it might override all the changes from the patches. You'll need to run the patcher again to keep the patches in place.
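One way to avoid forgetting this step is to re-run the patcher from a small Node script after each install. The following is a sketch under assumptions: the function names are hypothetical, and it only uses the patch command and the --packagePath option shown above.

```javascript
const { spawnSync } = require('node:child_process');

// Builds the npx arguments for the patch command; packagePath is optional.
function buildPatchArgs(packagePath) {
  const args = ['rebrowser-patches@latest', 'patch'];
  if (packagePath) {
    args.push('--packagePath', packagePath);
  }
  return args;
}

// Runs the patcher and reports whether it exited successfully.
function runPatcher(packagePath) {
  const result = spawnSync('npx', buildPatchArgs(packagePath), {
    stdio: 'inherit',
    // npx is a .cmd shim on Windows, so it has to go through a shell there.
    shell: process.platform === 'win32',
  });
  return !result.error && result.status === 0;
}

// Example: if (!runPatcher()) process.exitCode = 1;
```

A script like this could be wired into a postinstall step, though whether that fits depends on how your project manages its dependencies.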
If you already have your package patched and want to update to the latest version of rebrowser-patches, the easiest way would be to delete node_modules/puppeteer-core, then run npm install, and then run npx rebrowser-patches@latest patch.
Puppeteer Version | Release Date | Chrome Version | Patch Support |
---|---|---|---|
23.2.x | 2024-08-29 | 128 | ✅ |
23.1.x | 2024-08-14 | 127 | ✅ |
23.0.x | 2024-08-07 | 127 | ✅ |
22.15.x | 2024-07-31 | 127 | ✅ |
22.14.x | 2024-07-25 | 127 | ✅ |
22.13.x | 2024-07-11 | 126 | ✅ |
22.12.x and below | 2024-06-21 | 126 | ❌ |
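The table above can be turned into a quick pre-flight check. This is a sketch only: patchSupport is a hypothetical helper, the version list is copied from the table, and any version newer than the listed ones is reported as unknown rather than guessed.

```javascript
// Minor versions marked as supported in the table above.
const SUPPORTED_MINORS = ['23.2', '23.1', '23.0', '22.15', '22.14', '22.13'];

// Returns 'supported', 'unsupported', or 'unknown' for a Puppeteer version string.
function patchSupport(version) {
  const match = /^(\d+)\.(\d+)\./.exec(version);
  if (!match) {
    return 'unknown';
  }
  const major = Number(match[1]);
  const minor = Number(match[2]);
  if (SUPPORTED_MINORS.includes(`${major}.${minor}`)) {
    return 'supported';
  }
  // 22.12.x and below are listed as not supported.
  if (major < 22 || (major === 22 && minor <= 12)) {
    return 'unsupported';
  }
  return 'unknown'; // newer than anything in the table
}
```

The version string would typically come from the installed package's package.json.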
Currently, this repo contains only a patch for the latest Puppeteer version. Creating these patches is time-consuming as it requires digging into someone else's code and changing it in ways it wasn't designed for.
📣 If we see demand from the community for Playwright support, we'll be happy to allocate more resources to this mission. Please provide your feedback in the issues section.
We're currently developing more patches to improve web automation transparency, which will be released in this repo soon. Please support the project by clicking the ⭐️ star or watch button.
💭 If you have any ideas, thoughts, or questions, feel free to reach out to our team by email or use the issues section.
Always keep in mind: the less you manipulate browser internals via JS injections, the better. There are ways to detect that internal objects such as console, navigator, and others were affected by Proxy objects or Object.defineProperty. It's tricky, but it's always a cat-and-mouse game.
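As one simple illustration of such a check: in browsers, properties like navigator.webdriver normally live as accessors on the prototype, so an own property with the same name on the instance suggests it was redefined with Object.defineProperty. The sketch below uses a stand-in class instead of a real navigator so it runs anywhere; findInstanceOverrides and FakeNavigator are hypothetical names, and real detection scripts use many more signals.

```javascript
// Returns the names that exist as own properties on the instance itself.
function findInstanceOverrides(obj, names) {
  return names.filter((name) => Object.prototype.hasOwnProperty.call(obj, name));
}

// Stand-in for a browser object whose accessors live on the prototype.
class FakeNavigator {
  get webdriver() {
    return true;
  }
}

const clean = new FakeNavigator();

// A typical "stealth" tweak: shadow the prototype getter on the instance.
const tampered = new FakeNavigator();
Object.defineProperty(tampered, 'webdriver', { get: () => false });

// On a real page the call would look like:
//   findInstanceOverrides(navigator, ['webdriver', 'languages', 'plugins']);
```

Here the clean object yields an empty list while the tampered one yields its redefined property, which is exactly the kind of trace an injection can leave behind.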
If you've tried everything and still face issues, try asking a question in the issues section or consider using cloud solutions from Rebrowser.
This package is sponsored and maintained by Rebrowser. We allow you to scale your automation in the cloud with hundreds of unique fingerprints.
Our cloud browsers have great success rates and come with nice features such as notifications if your library uses Runtime.Enable during execution or has other red flags that could be improved. Create an account today to get invited to test our bleeding-edge platform and take your automation business to the next level.
When you try to run this patcher on a Windows machine, you will probably encounter an error because the patch command is not found. To fix this, you need to install Git, which includes patch.exe. After you have installed it, you need to add it to your PATH:
set PATH=%PATH%;C:\Program Files\Git\usr\bin\
You can check that patch.exe is installed correctly by running the following command:
patch -v
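The same check can be done from Node before invoking the patcher, which gives a clearer error than a failed patch run. This is a minimal sketch; isCommandAvailable is a hypothetical helper and it assumes the command exits with code 0 when called with -v.

```javascript
const { spawnSync } = require('node:child_process');

// Returns true if the command can be started and exits with code 0.
function isCommandAvailable(command, args = ['-v']) {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  // result.error is set (e.g. ENOENT) when the executable is not on PATH.
  return !result.error && result.status === 0;
}

// Example:
//   if (!isCommandAvailable('patch')) {
//     console.error('patch not found on PATH; install Git and add its usr\\bin folder');
//   }
```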