philbot9/youtube-comment-scraper

Parallelize job

Closed this issue · 2 comments

Hey,

Is there a way to speed up the process of scraping comments, e.g. by parallelizing it?

Hi there,

It's an interesting idea, but as far as I'm aware it's not possible to parallelise most of this. All comments are grouped into pages and each page contains a token that is needed to get the next page. So the pages have to be scraped sequentially.
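To illustrate why the page loop is inherently sequential, here is a minimal sketch in Python. It is not the library's actual code: `FAKE_PAGES`, `fetch_page`, and `scrape_all` are hypothetical names, and an in-memory dictionary stands in for the remote pages. The point is only that the token for page N+1 is unknown until page N has arrived.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the remote comment pages.
# Each page holds its comments and the token of the page that follows it.
FAKE_PAGES = {
    "start": {"comments": ["a", "b"], "next_token": "t1"},
    "t1": {"comments": ["c"], "next_token": "t2"},
    "t2": {"comments": ["d", "e"], "next_token": None},
}


def fetch_page(token: str) -> dict:
    """Hypothetical fetch: return the page identified by `token`."""
    return FAKE_PAGES[token]


def scrape_all(first_token: str) -> list:
    """Walk the token chain one page at a time and collect all comments."""
    comments = []
    token: Optional[str] = first_token
    while token is not None:
        page = fetch_page(token)
        comments.extend(page["comments"])
        # The next token only becomes known once this page has been fetched,
        # so the request for the following page cannot be issued earlier.
        token = page["next_token"]
    return comments
```

Calling `scrape_all("start")` visits the three fake pages strictly in order, because each request depends on the result of the previous one.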

I believe there is some room for improvement, though: the next page of comments could be fetched while the current page is still being parsed. Also, the replies to the comments on a page have to be fetched separately, and those requests could certainly be parallelised. I don't recall whether they already are or not.
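A rough sketch of both ideas, again in Python and again with made-up names (`fetch_page`, `fetch_replies`, `scrape_pipelined`, and the fake data are all assumptions, not the project's API): the request for the next page is submitted as soon as its token is known, and the replies for the comments on the current page are fetched concurrently in the meantime.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for the remote pages and reply threads.
FAKE_PAGES = {
    "start": {"comment_ids": ["c1", "c2"], "next_token": "t1"},
    "t1": {"comment_ids": ["c3"], "next_token": None},
}
FAKE_REPLIES = {"c1": ["r1", "r2"], "c3": ["r3"]}


def fetch_page(token: str) -> dict:
    """Hypothetical fetch: return the page identified by `token`."""
    return FAKE_PAGES[token]


def fetch_replies(comment_id: str) -> list:
    """Hypothetical fetch: return the replies to one comment (may be empty)."""
    return FAKE_REPLIES.get(comment_id, [])


def scrape_pipelined(first_token: str, workers: int = 4) -> list:
    """Overlap the next page fetch with reply fetching for the current page."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = pool.submit(fetch_page, first_token)
        while pending is not None:
            page = pending.result()
            next_token = page["next_token"]
            # Start fetching the next page before processing this one.
            pending = (
                pool.submit(fetch_page, next_token)
                if next_token is not None
                else None
            )
            # Reply requests are independent of each other, so they can run
            # concurrently; pool.map returns results in input order.
            comment_ids = page["comment_ids"]
            replies = list(pool.map(fetch_replies, comment_ids))
            for comment_id, comment_replies in zip(comment_ids, replies):
                results.append({"id": comment_id, "replies": comment_replies})
    return results
```

The pages themselves are still requested one after another, so the gain is limited to the time spent parsing and fetching replies, which matches the expectation that the speed-up would not be major.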

All in all, I don't think the speed improvements would be major, but you're more than welcome to take a look at https://github.com/philbot9/youtube-comments-task. That code is responsible for fetching and parsing the comments. Pull requests are always welcome. 😉

Closing due to inactivity.