Scraping book recommendations for fun and profit.
For both deduplication and linkage, this generally follows the process described in Peter Christen's excellent book Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection.
A collection of tools for use across sources and projects.
Book recommendations from The Tim Ferriss Show, helpfully collated by The Books of Titans Project.
- tap-books-of-titans is a scraper for collecting the full list of recommendations.
- clean-books-of-titans contains notebooks for deduplicating the raw list.
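
As a rough illustration of what deduplicating the raw list can involve, here is a minimal sketch using only the Python standard library. It is not the code from the notebooks: the `title` field, the first-letter blocking key, and the 0.9 threshold are all assumptions chosen for the example. It normalizes titles, groups them into blocks so that only plausible candidates are compared, and flags pairs whose similarity passes the threshold.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(text.split())


def find_duplicates(records, threshold=0.9):
    """Return index pairs of records whose normalized titles look alike.

    `records` is a list of dicts with a hypothetical "title" field.
    """
    # Blocking: only compare records that share the first character of the
    # normalized title, which avoids comparing every record with every other.
    blocks = defaultdict(list)
    for i, rec in enumerate(records):
        blocks[normalize(rec["title"])[:1]].append(i)

    pairs = []
    for indices in blocks.values():
        for i, j in combinations(indices, 2):
            a = normalize(records[i]["title"])
            b = normalize(records[j]["title"])
            # Comparison and classification: a single similarity score
            # checked against a fixed threshold.
            if SequenceMatcher(None, a, b).ratio() >= threshold:
                pairs.append((i, j))
    return pairs


# Made-up sample rows, for illustration only.
records = [
    {"title": "Man's Search for Meaning"},
    {"title": "Mans Search For Meaning"},
    {"title": "Sapiens"},
]
duplicates = find_duplicates(records)
```

A first-letter block is deliberately crude: it misses duplicates whose titles differ at the first character (for example a leading "The"), so a real run would likely need a more forgiving blocking key.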
An attempt at joining Books of Titans data with Goodreads bookshelf data.
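
One simple way to approach such a join is exact matching on a normalized key. The sketch below assumes both sources expose `title` and `author` fields and that the Goodreads rows carry a `shelf` field; those names, and the rule of dropping subtitles after a colon, are assumptions for illustration rather than the approach actually taken here.

```python
import re


def match_key(title, author):
    """Build a comparison key from a title and an author name."""
    def clean(text):
        return " ".join(re.sub(r"[^a-z0-9 ]", "", text.lower()).split())

    # Drop subtitles after a colon, since sources include them inconsistently.
    return (clean(title.split(":")[0]), clean(author))


def link(recommendations, shelf):
    """Pair each recommendation with the shelf entry sharing its key."""
    index = {match_key(b["title"], b["author"]): b for b in shelf}
    linked = []
    for rec in recommendations:
        match = index.get(match_key(rec["title"], rec["author"]))
        if match is not None:
            linked.append((rec, match))
    return linked


# Made-up sample rows with hypothetical field names.
recommendations = [
    {"title": "Sapiens: A Brief History of Humankind", "author": "Yuval Noah Harari"},
    {"title": "Dune", "author": "Frank Herbert"},
]
shelf = [
    {"title": "Sapiens", "author": "Yuval Noah Harari", "shelf": "read"},
]
linked = link(recommendations, shelf)
```

Exact key matching is fast but brittle: misspelled titles or differently formatted author names will not link, which is where approximate comparison becomes useful.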