Primary language: C#. License: MIT.

Blog-Scraper-LeerComics

This is an experimental project that scrapes a web page of comics and converts a comic into a PDF.

Introduction

The web page https://leer-comics.blogspot.com/ hosts old and abandoned comics in Spanish.

People contribute content there for those old comics, some of which are around 70 years old.

Reading those comics on the web page is inconvenient and requires an Internet connection, so this project demonstrates how to build a tool that downloads the content and generates a PDF file from it, so you can easily read it offline on your tablet, TV, or other device.

Here, I want to show you:

  • How to scrape a web page
  • How to download the web page content
  • How to extract the links or image URLs from the web page content
  • How to download all the images to your computer
  • How to generate a PDF document from all the downloaded images
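
The first four steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `ComicScraperSketch`, its method names, the file naming scheme, and the `.jpg` fallback extension are all assumptions made for the example. It also assumes that every `<img src>` on the page is a comic page, which will not hold for a real blog post; a real tool would need an XPath query that matches the markup of the target site.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical helper class, named for this example only.
public static class ComicScraperSketch
{
    // Parses the HTML and returns the absolute http(s) URL of every <img src="...">.
    public static List<string> ExtractImageUrls(string html, Uri pageUri)
    {
        var result = new List<string>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//img[@src]");
        if (nodes == null) return result;

        foreach (var node in nodes)
        {
            string src = node.GetAttributeValue("src", "");
            // Resolve relative URLs against the page URL; skip malformed values.
            if (!Uri.TryCreate(pageUri, src, out Uri absolute)) continue;
            // Skip data: URIs and other schemes that HttpClient cannot fetch.
            if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps) continue;
            if (!result.Contains(absolute.AbsoluteUri)) result.Add(absolute.AbsoluteUri);
        }
        return result;
    }

    // Downloads the page, then each image, and returns the saved file paths in page order.
    public static async Task<List<string>> DownloadImagesAsync(string pageUrl, string targetDir)
    {
        Directory.CreateDirectory(targetDir);
        var files = new List<string>();

        using (var http = new HttpClient())
        {
            string html = await http.GetStringAsync(pageUrl);
            List<string> urls = ExtractImageUrls(html, new Uri(pageUrl));

            for (int i = 0; i < urls.Count; i++)
            {
                byte[] bytes = await http.GetByteArrayAsync(urls[i]);
                string ext = Path.GetExtension(new Uri(urls[i]).AbsolutePath);
                if (string.IsNullOrEmpty(ext)) ext = ".jpg"; // assumed fallback extension
                // Zero-padded names (001, 002, ...) keep the pages sorted on disk.
                string path = Path.Combine(targetDir, $"{i + 1:D3}{ext}");
                File.WriteAllBytes(path, bytes);
                files.Add(path);
            }
        }
        return files;
    }
}
```

Keeping `ExtractImageUrls` separate from the download step means the parsing logic can be tried against a saved HTML file without touching the network.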

Programming Language

.NET Core 3.1

Packages used

HtmlAgilityPack

PdfSharpCore
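
As an illustration of how PdfSharpCore could cover the PDF step, here is a minimal sketch that places each downloaded image on its own page. The class name `ComicPdfSketch` and the method `BuildPdf` are hypothetical, and sizing each page to its image is an assumption of this example, not necessarily what the project does.

```csharp
using System.Collections.Generic;
using PdfSharpCore.Drawing;
using PdfSharpCore.Pdf;

// Hypothetical helper class, named for this example only.
public static class ComicPdfSketch
{
    // Writes one PDF page per image, in the order the paths are given.
    public static void BuildPdf(IEnumerable<string> imagePaths, string outputPath)
    {
        using (var document = new PdfDocument())
        {
            foreach (string imagePath in imagePaths)
            {
                using (XImage image = XImage.FromFile(imagePath))
                {
                    // Size the page to the image (in points) so nothing is cropped or stretched.
                    double width = image.PointWidth;
                    double height = image.PointHeight;

                    PdfPage page = document.AddPage();
                    page.Width = width;
                    page.Height = height;

                    using (XGraphics gfx = XGraphics.FromPdfPage(page))
                    {
                        gfx.DrawImage(image, 0, 0, width, height);
                    }
                }
            }
            document.Save(outputPath);
        }
    }
}
```

A call such as `ComicPdfSketch.BuildPdf(files, "comic.pdf")` would take the list of downloaded image paths; the output file name here is just a placeholder.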

Disclaimer notes

Be careful not to engage in piracy.

This is an experimental project to demonstrate how to scrape a web page easily, download the content of the web page, and convert part of the content into a PDF document, all using .NET Core 3.1.

Remember to respect the law and the intellectual property of websites and of the content you find on them. If you are in doubt about something, avoid doing it.