DrKain/scrape-youtube

Unconfirmed CPU spikes

Closed this issue · 6 comments

Received report of extremely high CPU spikes from a single search. This will need to be confirmed before a fix is pushed to ensure the issue is resolved.
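One way to confirm a report like this is to compare CPU time against wall-clock time around a single search. The sketch below is not part of this package: `measureCpu` is a hypothetical helper, and `task` stands in for whatever search call is being tested. It relies only on Node's built-in `process.cpuUsage()` and `process.hrtime.bigint()`.

```javascript
// Hypothetical helper (not part of scrape-youtube): runs an async task and
// reports how much CPU time the process used compared with elapsed wall time.
async function measureCpu(task) {
    const startCpu = process.cpuUsage();          // baseline, in microseconds
    const startWall = process.hrtime.bigint();    // baseline, in nanoseconds

    const result = await task();

    const cpu = process.cpuUsage(startCpu);       // CPU time used since the baseline
    const wallMs = Number(process.hrtime.bigint() - startWall) / 1e6;
    const cpuMs = (cpu.user + cpu.system) / 1000;

    return {
        result,
        cpuMs,
        wallMs,
        // Share of one core that was busy while the task ran.
        cpuPercent: wallMs > 0 ? (cpuMs / wallMs) * 100 : 0
    };
}
```

A value close to 100% for `cpuPercent` over the duration of a search would suggest the parsing step, rather than the network wait, is what keeps the process busy.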

On the AWS free-tier EC2 instance it spikes to over 80%,
and about the same on my 6700K.

Is this still happening? I've forked your code and used request + cheerio instead, and I've never had a CPU spike, so maybe it's because of JSDOM.

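// Assumes `request` and `cheerio` are already required and `url` is in scope.
return new Promise((resolve, reject) => {
    request({
        method: "GET",
        url: url
    }, (err, res, body) => {
        // Reject on network errors instead of passing an undefined body to cheerio.
        if (err) return reject(err);

        let results = [];
        const $ = cheerio.load(body);

        $(".yt-lockup").each((i, v) => {
            const $result = $(v);
            ...
        });

        resolve(results);
    });
});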

Pretty much what I was thinking too, though I could never reproduce the spikes. JSDOM isn't exactly lightweight, but I didn't expect something so drastic.
I'll be pushing a new release some time this week.
Nice fork.

I think I know what might cause this.

I'm having a similar issue in my repo, even though I'm using Cheerio.

TL;DR of that issue:
The scraper hogs the Node.js main thread when scraping a large number of videos (for example a playlist, where it can scrape up to 100 videos from a single playlist), which causes the Discord bot to stop playing audio until the scrape finishes. I ended up implementing a useWorkerThread option, which scrapes the page on a worker thread instead of the main thread, and that fixed the issue.
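A minimal sketch of the worker-thread idea, using Node's built-in `worker_threads` module. This is not the actual useWorkerThread implementation: `parseInWorker` is a hypothetical helper, and the regex-based extraction of `videoId` values is only a stand-in for the real, heavier parsing step.

```javascript
const { Worker } = require("worker_threads");

// Source of the worker script. The regex "parser" here is a hypothetical
// placeholder for the CPU-heavy parsing a real scraper would do.
const workerSource = `
const { parentPort, workerData } = require("worker_threads");
const ids = [];
const pattern = /"videoId":"([^"]+)"/g;
let match;
while ((match = pattern.exec(workerData.html)) !== null) {
    ids.push(match[1]);
}
parentPort.postMessage(ids);
`;

// Hypothetical helper: parse a page body on a worker thread so the main
// thread's event loop stays free (for example, to keep streaming audio).
function parseInWorker(html) {
    return new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, {
            eval: true,              // treat workerSource as code, not a file path
            workerData: { html }
        });
        worker.once("message", resolve);
        worker.once("error", reject);
        worker.once("exit", (code) => {
            if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
        });
    });
}
```

Calling `parseInWorker(body).then((ids) => ...)` from the request callback would hand the parsing to the worker. Spawning a worker has its own startup cost, so this is likely only worthwhile for large pages such as playlists.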

Is this still an issue you're having in the latest version?
I've since updated the package with significant improvements to both searching and parsing times as well as ditching Cheerio/JSDOM for https.

I haven't used this library since I made my own, but I just wanted to let you know in case this issue still persists.

Thanks for letting me know.