Slurp
Goal:
- Download comics from comicsworld website: https://comicsworld.in/manga/super-commando-dhruv/
Objectives
- Download each JPG page of each comic into that comic's own directory.
- Convert all the JPGs of each comic into a PDF file.
- Make this configurable, so that any superhero comic series can be downloaded (not just Dhruv).
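The first objective can be pictured with a minimal sketch that saves one page image into a per-comic directory. This is not the project's actual code: the class name PageSaver, the method savePage, and the zero-padded file naming are all assumptions made for illustration. Only JDK standard-library calls are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not taken from the slurp code base.
class PageSaver {
    // Saves one page image as <comicDir>/<NNN>.jpg and returns the written file.
    // The zero-padded naming scheme is an assumption made for this sketch.
    static Path savePage(URL pageUrl, Path comicDir, int pageNumber) {
        try {
            // Create the comic's own directory (and any missing parents).
            Files.createDirectories(comicDir);
            Path target = comicDir.resolve(String.format("%03d.jpg", pageNumber));
            // Stream the image bytes straight to disk, overwriting an earlier download.
            try (InputStream in = pageUrl.openStream()) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Converting the saved images into a PDF would need a PDF library on top of this, which the sketch leaves out.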
How to use:
Download this project to your local machine. For example, clone it from GitHub onto your C: drive using the commands below:
cd c:/
git clone https://github.com/PramodKumarYadav/slurp.git
You need Git, JDK 8, and Maven 3.8.1.
To download all comics from a series:
- Go to the main -> resources -> application.conf file.
- Change the field series="nagayan" to a series of your choice whose config is available in the resources directory, for example series="visharpi".
- Now go to test -> java -> slurp -> TestSeries and run testgetAllComicsFromASeriesAsPDFs.
- This will download all comics of the series into the directory ./comics/visharpi/ as PDFs and images.
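How the series value ties the config file and the output directory together can be sketched as follows. This is an illustration only: the class SeriesPaths and its methods are hypothetical, the src/main/resources prefix assumes the standard Maven layout, and java.util.Properties stands in for whatever config parser the project really uses (it only handles simple key="value" lines).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not taken from the slurp code base.
class SeriesPaths {
    // Reads the `series` key from text shaped like application.conf.
    // Assumption: simple key="value" lines; the real project may use a HOCON parser.
    static String readSeries(String confText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(confText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("series");
        if (raw == null) {
            throw new IllegalArgumentException("no 'series' key found");
        }
        // Strip the surrounding double quotes used by the .conf syntax.
        return raw.trim().replaceAll("^\"|\"$", "");
    }

    // Config file expected for a series, e.g. src/main/resources/visharpi.conf (assumed layout).
    static Path seriesConfig(String series) {
        return Paths.get("src", "main", "resources", series + ".conf");
    }

    // Output directory for a series, e.g. comics/visharpi.
    static Path outputDir(String series) {
        return Paths.get("comics", series);
    }
}
```

With series="visharpi", the sketch resolves the config file to visharpi.conf under the resources directory and the output directory to comics/visharpi. If no matching .conf file exists in resources, the series cannot be downloaded.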
To download a single comic from a series:
- Go to the main -> resources -> application.conf file.
- Change the field series="visharpi" to a series of your choice whose config is available in the resources directory, for example series="nagayan".
- Now go to the config file of the chosen series: main -> resources -> nagayan.conf.
- Change the field singleComicUrl="whatever url is here..." to the URL of a comic from the same series, for example (taken from the series page): https://comicsworld.in/manga/read-complete-nagayan-series/7-samarkaand-nagayan/
- Now go to test -> java -> slurp -> TestComics and run testgetASingleComicAsPDF.
- This will download the comic whose URL you put in that series config into the directory, say ./comics/nagayan/, as a PDF and images.
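Because singleComicUrl must point to a comic from the same series, a quick sanity check before running the test can catch a mismatched URL. The sketch below is an assumption-laden illustration: the class ComicUrl and both methods are hypothetical, and the series page URL used in the usage note is inferred by trimming the example comic URL rather than read from the project's config.

```java
import java.net.URI;

// Hypothetical helper, not taken from the slurp code base.
class ComicUrl {
    // Returns the last non-empty path segment of a comic URL,
    // which can serve as a readable name for the comic.
    static String slug(String comicUrl) {
        String path = URI.create(comicUrl).getPath();
        if (path == null) {
            throw new IllegalArgumentException("URL has no path: " + comicUrl);
        }
        String[] parts = path.split("/");
        for (int i = parts.length - 1; i >= 0; i--) {
            if (!parts[i].isEmpty()) {
                return parts[i];
            }
        }
        throw new IllegalArgumentException("URL has no path segment: " + comicUrl);
    }

    // True when the comic URL sits strictly underneath the series page URL.
    static boolean belongsToSeries(String seriesPageUrl, String comicUrl) {
        String prefix = seriesPageUrl.endsWith("/") ? seriesPageUrl : seriesPageUrl + "/";
        return comicUrl.startsWith(prefix) && comicUrl.length() > prefix.length();
    }
}
```

For the example URL above, slug returns 7-samarkaand-nagayan, and belongsToSeries is true for the assumed series page https://comicsworld.in/manga/read-complete-nagayan-series/ but false for https://comicsworld.in/manga/super-commando-dhruv/.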