xianc/CS446-WebCrawler
A simple web crawler that generates a file of the first 100 unique links it finds, restricting the links to web pages and PDFs on cs.umass.edu and respecting robots.txt...
Java
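The description above names three rules: stay on cs.umass.edu, keep only web pages and PDFs, and stop at 100 unique links. The following is a minimal sketch of how such a filter might look, not the repository's actual code. The class name `LinkFilterSketch`, its method names, the treatment of subdomains as part of cs.umass.edu, and the extension-based test for "web page or PDF" are all assumptions made for illustration. Fetching pages and checking robots.txt are left out.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; names and rules are illustrative assumptions.
class LinkFilterSketch {
    static final int MAX_LINKS = 100;                   // limit stated in the description
    static final String ALLOWED_DOMAIN = "cs.umass.edu";

    // Returns true for http(s) URLs on cs.umass.edu (subdomains included,
    // by assumption) whose path looks like a web page or a PDF.
    public static boolean isAllowed(String url) {
        URI uri;
        try {
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return false;                               // malformed links are skipped
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false;                               // relative or non-hierarchical URI
        }
        scheme = scheme.toLowerCase(Locale.ROOT);
        if (!scheme.equals("http") && !scheme.equals("https")) {
            return false;
        }
        host = host.toLowerCase(Locale.ROOT);
        if (!host.equals(ALLOWED_DOMAIN) && !host.endsWith("." + ALLOWED_DOMAIN)) {
            return false;
        }
        String path = uri.getPath() == null ? "" : uri.getPath().toLowerCase(Locale.ROOT);
        String lastSegment = path.substring(path.lastIndexOf('/') + 1);
        int dot = lastSegment.lastIndexOf('.');
        if (dot < 0) {
            return true;                                // no extension: assume a web page
        }
        String ext = lastSegment.substring(dot + 1);
        return ext.equals("html") || ext.equals("htm") || ext.equals("pdf");
    }

    // Collects allowed links in discovery order, dropping duplicates,
    // and stops once MAX_LINKS unique links have been kept.
    public static List<String> firstUniqueLinks(Iterable<String> candidates) {
        Set<String> seen = new LinkedHashSet<>();
        for (String url : candidates) {
            if (seen.size() >= MAX_LINKS) {
                break;
            }
            if (isAllowed(url)) {
                seen.add(url);
            }
        }
        return new ArrayList<>(seen);
    }
}
```

A `LinkedHashSet` is used here because it removes duplicates while preserving the order in which links were found, which matches the "first 100 unique links" wording.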