
First web scraping experiment (shell and Python)



I wanted to investigate Goodreads' category numbers and play a bit with Python's HTML parsing libraries (Beautiful Soup in this case).

To download the book category HTML pages from Goodreads:


Then, to extract the data and populate a CSV with it:


or to do both:

./download_script && ./assemble_csv
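
For anyone curious what the CSV step involves, here is a minimal sketch of the general idea. It is not the code in assemble_csv. The markup in SAMPLE_HTML, the class names, and the helper names (CategoryParser, parse_categories, rows_to_csv) are all hypothetical, since the real Goodreads page structure is not described here. It also uses the standard library's html.parser instead of Beautiful Soup, so it runs without extra packages.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: the real Goodreads pages will look different.
SAMPLE_HTML = """
<div class="category"><a href="/genres/fantasy">Fantasy</a> <span class="count">1,234 books</span></div>
<div class="category"><a href="/genres/poetry">Poetry</a> <span class="count">56 books</span></div>
"""


class CategoryParser(HTMLParser):
    """Collects (category name, book count) pairs from the assumed markup."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # which value the next text chunk belongs to
        self._name = None   # category name waiting for its count

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and (attrs.get("href") or "").startswith("/genres/"):
            self._field = "name"
        elif tag == "span" and attrs.get("class") == "count":
            self._field = "count"

    def handle_data(self, data):
        if self._field == "name":
            self._name = data.strip()
        elif self._field == "count" and self._name is not None:
            # Keep only the digits, so "1,234 books" becomes 1234.
            digits = "".join(ch for ch in data if ch.isdigit())
            if digits:
                self.rows.append((self._name, int(digits)))
            self._name = None
        self._field = None


def parse_categories(html):
    """Return a list of (name, count) tuples found in the HTML text."""
    parser = CategoryParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "books"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_categories(SAMPLE_HTML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

With real pages, the same loop would run over each file in list_html, with the tag and attribute checks adjusted to whatever the downloaded HTML actually contains.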


examples: The examples folder contains diagrams of the most and least popular categories (made after importing the generated CSV into a Google Docs spreadsheet).

list_html: Downloaded files. The folder's content is committed in case anyone wants to experiment without retrieving the data.


I did not explore the Goodreads API, as I was more interested in experimenting with web scraping.