/CS446-WebCrawler

A simple web crawler that generates a file of the first 100 unique links it finds, restricting the links to web pages and PDFs hosted on cs.umass.edu and respecting robots.txt...
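The crawl described above can be sketched roughly as follows. This is not the repository's code, only a minimal illustration under stated assumptions: the class name `CrawlSketch`, the methods `inDomain`, `allowed`, and `crawl`, and the injected `fetchLinks` function are all hypothetical. Fetching pages, parsing robots.txt, writing the output file, and filtering by file type are left out; the robots.txt rules are assumed to arrive already reduced to a list of disallowed path prefixes, subdomains of cs.umass.edu are assumed to count as in-domain, and the seed URL is assumed to count toward the 100.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the repository.
class CrawlSketch {

    /** True if the URL is http(s) and its host is cs.umass.edu or a subdomain of it. */
    static boolean inDomain(String url) {
        try {
            URI u = URI.create(url);
            String scheme = u.getScheme();
            String host = u.getHost();
            if (scheme == null || host == null) return false;
            if (!scheme.equalsIgnoreCase("http") && !scheme.equalsIgnoreCase("https")) return false;
            host = host.toLowerCase(Locale.ROOT);
            return host.equals("cs.umass.edu") || host.endsWith(".cs.umass.edu");
        } catch (IllegalArgumentException e) {
            return false; // malformed URLs are simply skipped
        }
    }

    /** Simplified robots.txt check: reject paths that start with a disallowed prefix. */
    static boolean allowed(String url, List<String> disallowed) {
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty()) path = "/";
        for (String prefix : disallowed) {
            if (!prefix.isEmpty() && path.startsWith(prefix)) return false;
        }
        return true;
    }

    /** Breadth-first crawl that stops once `limit` unique, permitted links are collected. */
    static List<String> crawl(String seed, Function<String, List<String>> fetchLinks,
                              List<String> disallowed, int limit) {
        Set<String> seen = new LinkedHashSet<>(); // preserves discovery order
        Deque<String> frontier = new ArrayDeque<>();
        if (inDomain(seed) && allowed(seed, disallowed)) {
            seen.add(seed);
            frontier.add(seed);
        }
        while (!frontier.isEmpty() && seen.size() < limit) {
            String page = frontier.poll();
            for (String link : fetchLinks.apply(page)) {
                if (seen.size() >= limit) break;
                // seen.add returns false for duplicates, so each link is queued once
                if (inDomain(link) && allowed(link, disallowed) && seen.add(link)) {
                    frontier.add(link);
                }
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // A tiny in-memory link graph stands in for real HTTP fetching.
        Map<String, List<String>> graph = Map.of(
            "https://www.cs.umass.edu/",
            List.of("https://www.cs.umass.edu/a", "https://example.com/x"),
            "https://www.cs.umass.edu/a",
            List.of("https://www.cs.umass.edu/b.pdf"));
        List<String> links = crawl("https://www.cs.umass.edu/",
            page -> graph.getOrDefault(page, List.of()), List.of("/private/"), 100);
        links.forEach(System.out::println);
    }
}
```

Passing the link fetcher in as a function keeps the sketch testable without network access; a real crawler would replace it with an HTTP client plus an HTML link extractor, and would build the disallowed list from the site's robots.txt.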

Primary language: Java
