A multi-threaded web scraper that downloads all tutorials from www.learncpp.com and converts them to PDF files concurrently.
Please support learncpp.com here: https://www.learncpp.com/about/
Get the image
docker pull amalrajan/learncpp-download:latest
And run the container
docker run --rm --name=learncpp-download --mount type=bind,destination=/app/learncpp,source=/home/amalr/temp/downloads amalrajan/learncpp-download
Replace /home/amalr/temp/downloads with a local path on your system where you want the files downloaded.
You need Python 3.10 and wkhtmltopdf installed on your system.
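A quick way to verify both prerequisites is a small check script. This is a sketch, not part of the repository; the check_prereqs helper is hypothetical:

```python
import shutil
import sys


def check_prereqs() -> list[str]:
    """Return a list of missing prerequisites (hypothetical helper)."""
    missing = []
    if sys.version_info < (3, 10):
        missing.append("Python 3.10+")
    # shutil.which searches PATH for the wkhtmltopdf executable.
    if shutil.which("wkhtmltopdf") is None:
        missing.append("wkhtmltopdf")
    return missing


if __name__ == "__main__":
    problems = check_prereqs()
    if problems:
        print("Missing prerequisites:", ", ".join(problems))
    else:
        print("All prerequisites found.")
```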
Clone the repository
git clone https://github.com/amalrajan/learncpp-download.git
Install Python dependencies
cd learncpp-download
pip install -r requirements.txt
Run the script
scrapy crawl learncpp
You'll find the downloaded files in the learncpp directory under the repository root.
Rate limit errors:
- Increase DOWNLOAD_DELAY in settings.py from its default of 0 to 0.2.
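For reference, DOWNLOAD_DELAY is a standard Scrapy setting; the change would look like this in settings.py (the 0.2 value follows the suggestion above):

```python
# settings.py (Scrapy project settings)
# Wait 0.2 seconds between consecutive requests to the same domain,
# which helps avoid tripping the site's rate limiting.
DOWNLOAD_DELAY = 0.2
```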
High CPU usage:
- Decrease max_workers in learncpp.py from the default 192 to reduce CPU load.

self.executor = ThreadPoolExecutor(max_workers=192)  # Limit to 192 concurrent PDF conversions
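The effect of max_workers can be seen in a standalone sketch: the pool bounds how many tasks run at once, so lowering it trades throughput for reduced CPU load. The convert function here is a stand-in for the project's PDF conversion, not its actual code:

```python
from concurrent.futures import ThreadPoolExecutor


def convert(page: int) -> str:
    # Stand-in for one PDF conversion task (illustrative only).
    return f"page-{page}.pdf"


# At most 8 conversions run concurrently; submitted tasks beyond
# that limit wait in the pool's internal queue.
with ThreadPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(convert, range(4)))

print(results)  # → ['page-0.pdf', 'page-1.pdf', 'page-2.pdf', 'page-3.pdf']
```

executor.map preserves input order in its results even though the tasks themselves may finish out of order.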
Further issues:
- Report at https://github.com/amalrajan/learncpp-download/issues, attaching console logs.