Crawler

The Crawler has been written as my submission for Coursework 5 for Birkbeck's Programming in Java module.

The Crawler is a web crawler that searches web pages starting from a given URL, stopping after either a set number of links or a set depth of pages; both limits are supplied by the user.
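The two stopping conditions can be sketched as a bounded breadth-first traversal. The sketch below crawls a small hypothetical in-memory link graph instead of the real web (page names and the graph itself are illustrative assumptions, not part of the actual Crawler code):

```java
import java.util.*;

public class CrawlSketch {
    // Hypothetical in-memory "web": each page maps to the links it contains.
    static Map<String, List<String>> pages = Map.of(
        "a", List.of("b", "c"),
        "b", List.of("d"),
        "c", List.of("d", "e"),
        "d", List.of(),
        "e", List.of("a"));

    // Breadth-first crawl bounded by a maximum number of links visited
    // and a maximum depth, mirroring the two user-supplied limits.
    static List<String> crawl(String start, int maxLinks, int maxDepth) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        Map<String, Integer> depth = new HashMap<>();
        queue.add(start);
        seen.add(start);
        depth.put(start, 0);
        while (!queue.isEmpty() && visited.size() < maxLinks) {
            String url = queue.poll();
            visited.add(url);
            int d = depth.get(url);
            if (d >= maxDepth) continue; // do not follow links beyond maxDepth
            for (String link : pages.getOrDefault(url, List.of())) {
                if (seen.add(link)) {    // enqueue each page only once
                    depth.put(link, d + 1);
                    queue.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Depth limit 1: the start page plus its direct links only.
        System.out.println(crawl("a", 10, 1));
    }
}
```

Tracking each page's depth in a map lets the same loop enforce both limits: the link count caps the loop itself, while the depth check decides whether a page's outgoing links are followed at all.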

The application's entry point is the main method in the Crawler class; set Crawler as your main class in your IDE when compiling and running it.

The Crawler is an application with a minimal text-based user interface through which the user can define the starting URL and the other parameters.

The crawler does not currently accept any command-line arguments, although this is planned as a second-version feature that will allow search terms to be passed to the application.

The Crawler has been written to use JavaDB, Oracle's distribution of the Apache Derby database. You will need to ensure that your build path points to a Derby installation on your hard drive in order to compile this application. More information about the Derby database can be found at the following link:

http://www.oracle.com/technetwork/java/javadb/overview/index.html

The Crawler uses an embedded JavaDB (Derby) database held entirely in memory, so the results of searches do not persist between sessions. In this first version of the crawler the results of the search are displayed only in the system console.
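An embedded in-memory Derby database is selected through the JDBC URL's `memory:` subprotocol. The sketch below only builds such a URL; the database name `crawlerDB` is an illustrative assumption, and actually opening the connection requires derby.jar on the classpath:

```java
public class InMemoryDerby {
    // Builds the JDBC URL for an embedded in-memory Derby database.
    // The "memory:" subprotocol keeps all data in RAM, so nothing
    // persists after the JVM exits, matching the behaviour described above.
    static String inMemoryUrl(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        String url = inMemoryUrl("crawlerDB"); // database name is illustrative
        System.out.println(url);
        // With derby.jar on the classpath, the database would be opened via:
        //   java.sql.DriverManager.getConnection(url)
    }
}
```

The `create=true` attribute tells Derby to create the database on first connection; because it lives in memory, a fresh, empty database is created on every run.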

Please contact James Hill with any further questions about this crawler.