- so when crawling example.com it would crawl all pages within the example.com domain, but not follow the links to Facebook or Instagram accounts or subdomains like cloud.example.com.
- showing which static assets each page depends on, and the links between pages. Choose the most appropriate data structure to store & display this site map.
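The two requirements above (staying on the exact starting host, and recording each page's links and static assets) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the names `in_scope`, `add_page` and `site_map` are hypothetical, and the site map is assumed to be a plain dict keyed by page URL, which is one reasonable choice among several. Hosts are compared exactly, so `cloud.example.com` (and also `www.example.com`) counts as a different site from `example.com`.

```python
from urllib.parse import urljoin, urlparse


def in_scope(url, root_url):
    """Return True only when url has exactly the same host as root_url."""
    return urlparse(url).hostname == urlparse(root_url).hostname


def add_page(site_map, page_url, hrefs, asset_urls, root_url):
    """Record one crawled page and return the in-scope links to visit next."""
    # Resolve relative hrefs against the page they were found on.
    absolute = {urljoin(page_url, href) for href in hrefs}
    # Keep only links on the same host; external sites and subdomains are dropped.
    internal = {u for u in absolute if in_scope(u, root_url)}
    site_map[page_url] = {
        "links": sorted(internal),
        "assets": sorted(urljoin(page_url, a) for a in asset_urls),
    }
    return internal


# Hypothetical input: links and assets already extracted from the home page.
site_map = {}
to_visit = add_page(
    site_map,
    "https://example.com/",
    hrefs=["/about", "https://cloud.example.com/login", "https://facebook.com/example"],
    asset_urls=["/static/site.css", "/static/logo.png"],
    root_url="https://example.com",
)
```

Here `to_visit` contains only `https://example.com/about`, while the subdomain and Facebook links are skipped. A dict of this shape is effectively an adjacency list, so it can be printed page by page or serialised to JSON for display.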
- Install using "pip install reppy"
1. Modify the sitemap data structure to dump to a file after every crawled page, rather than storing the entire site in memory and dumping it at the end, since it can grow very large
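One possible way to approach this item is to append one JSON line per crawled page, so that nothing has to be held in memory between pages. The sketch below is an assumption about how it could look, not existing project code: `dump_page`, `load_site_map` and the record layout (`url`, `links`, `assets`) are hypothetical names chosen for illustration.

```python
import json


def dump_page(path, page_url, links, assets):
    """Append one page's record to the file at path as a single JSON line."""
    record = {"url": page_url, "links": sorted(links), "assets": sorted(assets)}
    # Mode "a" appends, so earlier pages are never rewritten or reloaded.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def load_site_map(path):
    """Rebuild the site map by reading the file one line (one page) at a time."""
    site_map = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            record = json.loads(line)
            site_map[record["url"]] = {
                "links": record["links"],
                "assets": record["assets"],
            }
    return site_map
```

The crawler would call `dump_page` right after each page is processed. The file can then be streamed line by line for display, and `load_site_map` is only needed when the whole map has to be in memory at once.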
1. To check URL parsing