sayem314/hooman

Reduce dependency overhead

Opened this issue · 4 comments

jsdom and user-agents bloat this package to 6.2 MB minified.
I suggest making the user agent user-supplied and replacing jsdom with cheerio or something similar.
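
To make the "user-supplied user agent" idea concrete, here is a minimal sketch. The `buildHeaders` function, the `userAgent` option and the `DEFAULT_USER_AGENT` placeholder are all hypothetical names for illustration; none of them come from hooman's code.

```javascript
// Hypothetical fallback value; a real library would choose its own default.
const DEFAULT_USER_AGENT = 'Mozilla/5.0 (placeholder)';

// Build request headers, letting the caller override the user agent.
// `options.userAgent` and `options.headers` are assumed option names.
function buildHeaders(options = {}) {
  const { userAgent = DEFAULT_USER_AGENT, headers = {} } = options;
  // The user agent is written last so it wins over a lowercase
  // 'user-agent' key passed in `headers`.
  return { ...headers, 'user-agent': userAgent };
}

const custom = buildHeaders({ userAgent: 'my-client/1.0', headers: { accept: 'text/html' } });
const fallback = buildHeaders();
```

With something like this, the caller supplies the string and the package would not need to bundle a user agent database.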

JSDOM can't be replaced with Cheerio. The goal of JSDOM is to provide the same DOM environment that we see in the browser. As for user-agents, yes, I might remove it in the future.

Thanks for the answer.
Is jsdom actually used for anything other than parsing the HTML?
I didn't see anything like that when checking the code.

It's used here both for retrieving form data and solving challenges from Cloudflare.

hooman/lib/core.js

Lines 11 to 33 in fd80462

// Emulate browser dom
const { document } = new JSDOM(html, { url, referrer: url }).window;
// Parse script to execute
let jschl_answer = document.getElementsByTagName('script')[0].textContent;
jschl_answer = jschl_answer.substring(jschl_answer.indexOf('var s'));
jschl_answer = jschl_answer.substring(0, jschl_answer.indexOf('f.submit'));
jschl_answer = jschl_answer.replace('location.', 'document.location.');
// Retrieve answers from form to submit challenge
const body = [];
vm.runInNewContext(
  `${jschl_answer}
  // Retrieve answers from form to submit challenge
  const input = document.getElementsByTagName("input");
  for (const i of input) {
    body.push(i.name + "=" + encodeURIComponent(i.value));
  }`,
  { document, body }
);
// Parse request information
const { method, action, enctype } = document.getElementById('challenge-form');

@sayem314 can I suggest you use top-user-agents?

It's just a JSON of user agents, updated every time you fetch the package.

849 bytes of package 🙂
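
For illustration, picking a user agent from such a list could look like the sketch below. The `userAgents` array holds placeholder strings standing in for the package's data (assuming the package exports a plain array of strings), and `pickUserAgent` is a hypothetical helper, not part of any library.

```javascript
// Placeholder data; in practice this would be the list loaded from the package.
const userAgents = [
  'placeholder-agent/1.0',
  'placeholder-agent/2.0',
  'placeholder-agent/3.0',
];

// Return one entry from the list. `random` is injectable so the choice
// can be made deterministic in tests.
function pickUserAgent(list, random = Math.random) {
  if (list.length === 0) {
    throw new Error('user agent list is empty');
  }
  return list[Math.floor(random() * list.length)];
}

const chosen = pickUserAgent(userAgents);
```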