Implement domain categorization
Closed this issue · 2 comments
gunesacar commented
[from today's discussions] One alternative for building a list of shopping sites is to categorize the Alexa top 1M sites using an external (non-Alexa) API.
The following code can be used to query the Bluecoat API:
https://github.com/PoorBillionaire/sitereview/blob/master/sitereview.py
For instance, it categorizes myntra.com as a shopping site:
python sitereview.py https://www.myntra.com/
======================
Symantec Site Review
======================
URL: https://www.myntra.com:443/
Last Time Rated/Reviewed: > 7 days
Category: Shopping
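To run this over many sites, the printed report would need to be turned back into fields. A minimal sketch, assuming the report always uses the "Key: value" layout shown above; `parse_report` and `SAMPLE_REPORT` are hypothetical names and are not part of sitereview.py:

```python
# Hypothetical helper: parse the text report printed by sitereview.py.
# Assumes every field is on its own line in "Key: value" form, as in the
# sample output above.

SAMPLE_REPORT = """\
======================
Symantec Site Review
======================
URL: https://www.myntra.com:443/
Last Time Rated/Reviewed: > 7 days
Category: Shopping
"""


def parse_report(text):
    """Return a dict of the 'Key: value' fields found in the report."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the '=====' rules, and the title (no 'Key: value').
        if not line or set(line) == {"="} or ": " not in line:
            continue
        # Split on the first ': ' only, so 'https://...:443/' stays intact.
        key, value = line.split(": ", 1)
        fields[key] = value
    return fields


report = parse_report(SAMPLE_REPORT)
print(report["Category"])
```

Splitting on the first ": " only keeps the port in the URL from being mistaken for a separator.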
gunesacar commented
When a site has multiple categories, we keep only the first one and miss the rest:
Example: http://www.game.co.uk/ Informational, Shopping and Games -> Informational
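One possible fix is to keep every category and test membership, instead of reading only the first entry. A rough sketch, assuming the categories are already available as a list of names; the helper names are hypothetical, and splitting the game.co.uk result into three separate names is also an assumption:

```python
# Hypothetical helpers; not taken from sitereview.py.

def primary_category(categories):
    """Current behaviour described above: only the first category survives."""
    return categories[0] if categories else None


def has_category(categories, wanted="Shopping"):
    """Check all categories (case-insensitively) instead of only the first."""
    wanted = wanted.strip().lower()
    return any(c.strip().lower() == wanted for c in categories)


# Assumed split of the categories reported for http://www.game.co.uk/
game_co_uk = ["Informational", "Shopping", "Games"]

print(primary_category(game_co_uk))  # Informational
print(has_category(game_co_uk))      # True
```

With the membership check, game.co.uk would still end up in the shopping list even though Shopping is not its first category.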