A web crawler written in C
A web crawler is a bot or program that browses the web in order to index it.
- libcurl: A library built for making HTTP requests.
- TidyLib: A library built for cleaning HTML pages. We used it to parse HTML and extract links.
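As a rough illustration of how a page downloaded with libcurl might be collected in memory before it is handed to TidyLib, the sketch below defines a growable buffer and a write callback with the signature libcurl expects for CURLOPT_WRITEFUNCTION. The names page_buf and append_chunk are hypothetical and are not taken from this project's source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for one downloaded page (not from the project). */
struct page_buf {
    char *data;   /* NUL-terminated page contents, or NULL while empty */
    size_t len;   /* number of bytes stored, excluding the trailing NUL */
};

/* Matches the callback signature used with CURLOPT_WRITEFUNCTION:
 * each received chunk arrives in ptr (size * nmemb bytes), and userdata
 * is the pointer registered with CURLOPT_WRITEDATA. Returning a value
 * different from the chunk size tells libcurl to abort the transfer. */
size_t append_chunk(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct page_buf *buf = userdata;
    size_t n = size * nmemb;
    char *grown;

    assert(buf != NULL);

    /* Grow the buffer by the chunk size plus one byte for the NUL. */
    grown = realloc(buf->data, buf->len + n + 1);
    if (grown == NULL)
        return 0; /* out of memory: signal an error */

    memcpy(grown + buf->len, ptr, n);
    buf->data = grown;
    buf->len += n;
    buf->data[buf->len] = '\0';
    return n;
}
```

With libcurl, a callback like this would typically be registered on an easy handle with curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, append_chunk) and curl_easy_setopt(handle, CURLOPT_WRITEDATA, &buf) before calling curl_easy_perform(handle). The caller frees buf.data afterwards.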
Assuming libcurl and TidyLib are installed, run make
and then execute ./main <url>
to start crawling.
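This README does not describe how the crawler decides which page to visit next. One common approach, shown here purely as an assumption, is a first-in-first-out frontier that skips URLs it has already seen. The names frontier, frontier_add, frontier_next and frontier_free, and the FRONTIER_CAP limit, are hypothetical and chosen only for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FRONTIER_CAP 1024 /* arbitrary limit chosen for this sketch */

/* Hypothetical crawl frontier: entries in [head, count) are still to be
 * visited. Visited entries stay in the array so duplicates can be detected. */
struct frontier {
    char *urls[FRONTIER_CAP];
    size_t count; /* total URLs ever added */
    size_t head;  /* index of the next URL to visit */
};

/* Adds a copy of url unless it was seen before or the frontier is full.
 * Returns 1 if the URL was added, 0 otherwise. */
int frontier_add(struct frontier *f, const char *url)
{
    size_t i, n;
    char *copy;

    assert(f != NULL && url != NULL);

    /* Linear scan is enough for a sketch; a hash set would scale better. */
    for (i = 0; i < f->count; i++)
        if (strcmp(f->urls[i], url) == 0)
            return 0;
    if (f->count == FRONTIER_CAP)
        return 0;

    n = strlen(url) + 1;
    copy = malloc(n);
    if (copy == NULL)
        return 0;
    memcpy(copy, url, n);
    f->urls[f->count++] = copy;
    return 1;
}

/* Returns the next unvisited URL, or NULL when none remain.
 * The returned string is owned by the frontier. */
const char *frontier_next(struct frontier *f)
{
    assert(f != NULL);
    return f->head < f->count ? f->urls[f->head++] : NULL;
}

/* Releases every stored URL and resets the frontier. */
void frontier_free(struct frontier *f)
{
    size_t i;

    assert(f != NULL);
    for (i = 0; i < f->count; i++)
        free(f->urls[i]);
    f->count = f->head = 0;
}
```

A crawl loop built on this would seed the frontier with the URL given on the command line, then repeatedly take frontier_next, fetch that page, and call frontier_add for each extracted link until frontier_next returns NULL.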