Student reading carnival
All purpose search on books
- Linxuan Yang
- Ye Hong
- Chenfeng Fan
- Limian Guo
$ pip3 install requests
$ pip3 install Scrapy
-
Get book titles json file with category @goodreads
$ python goodreads/goodreads_title_crawler.py
-
Get literature json file @sparknotes
$ cd sparknotes
$ scrapy crawl titles
, this will get all book titles and links in literature tab$ scrapy crawl details
, this will get all data for each book
{'downloader/exception_count': 22, 'downloader/exception_type_count/twisted.internet.error.NoRouteError': 16, 'downloader/exception_type_count/twisted.internet.error.TimeoutError': 6, 'downloader/request_bytes': 5375772, 'downloader/request_count': 5985, 'downloader/request_method_count/GET': 5985, 'downloader/response_bytes': 48833687, 'downloader/response_count': 5963, 'downloader/response_status_count/200': 5667, 'downloader/response_status_count/404': 296, 'dupefilter/filtered': 9, 'finish_reason': 'finished', 'finish_time': datetime.datetime(2019, 5, 9, 20, 52, 48, 906551), 'item_scraped_count': 579, 'log_count/DEBUG': 6565, 'log_count/INFO': 396, 'log_count/WARNING': 1, 'memusage/max': 78176256, 'memusage/startup': 50728960, 'request_depth_max': 12, 'response_received_count': 5963, 'retry/count': 22, 'retry/reason_count/twisted.internet.error.NoRouteError': 16, 'retry/reason_count/twisted.internet.error.TimeoutError': 6, 'robotstxt/request_count': 1, 'robotstxt/response_count': 1, 'robotstxt/response_status_count/200': 1, 'scheduler/dequeued': 5984, 'scheduler/dequeued/memory': 5984, 'scheduler/enqueued': 5984, 'scheduler/enqueued/memory': 5984, 'start_time': datetime.datetime(2019, 5, 9, 5, 29, 9, 874055)} 2019-05-09 16:52:48 [scrapy.core.engine] INFO: Spider closed (finished)
- what we achieve
- methods
- what we achieve
- methods\
- Clean and organize raw data from @sparknotes
$ index.py
\ - For each book, each key/field associate with a string of related text
- Clean and organize raw data from @sparknotes
- what we achieve
- methods
- Words in
""
: phrase search -word
: difference search only+word
: conjunctive search only- "More like this" button
- input auto completion function
- Words in
- running instruction
python3 query.py
- you need a proper SQLite command working on your computer to have access to our databse
- we use Bootstrap to build a progressive single page web application
- methods