Primary language: Python. License: Creative Commons Zero v1.0 Universal (CC0-1.0).

crawler_exercise

A simple web crawler built as an interview exercise, using Python 3, asyncio, and aiohttp.
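For readers unfamiliar with the asyncio approach, here is a minimal sketch of how a concurrent crawl loop can be structured. It is not the code in main.py: the names `crawl`, `fetch_links`, and `FAKE_SITE` are hypothetical, and an in-memory dictionary stands in for the network so the example runs without aiohttp. In a real crawler, `fetch_links` would issue an HTTP request and parse the response.

```python
import asyncio

# Hypothetical in-memory "site": URL -> links found on that page.
FAKE_SITE = {
    "http://example.test/": ["http://example.test/a", "http://example.test/b"],
    "http://example.test/a": ["http://example.test/b"],
    "http://example.test/b": [],
}


async def fetch_links(url):
    """Stand-in for an HTTP fetch plus link extraction (assumption)."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return FAKE_SITE.get(url, [])


async def crawl(start_url, fetch=fetch_links):
    """Visit every reachable URL once; return {url: links} in visit order."""
    seen = {start_url}
    frontier = [start_url]
    results = {}
    while frontier:
        # Fetch all pages in the current frontier concurrently.
        pages = await asyncio.gather(*(fetch(url) for url in frontier))
        next_frontier = []
        for url, links in zip(frontier, pages):
            results[url] = links
            for link in links:
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return results


if __name__ == "__main__":
    for url, links in asyncio.run(crawl("http://example.test/")).items():
        print(url)
        for link in links:
            print("    " + link)
```

The `seen` set ensures each URL is fetched at most once, and `asyncio.gather` lets all pages at the same depth be fetched concurrently.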

Installation

# (git clone...)
cd crawler_exercise

# create an isolated environment in which to install dependencies
# (this will avoid conflicts with system-wide libraries):
python3 -m venv env && source env/bin/activate

# install dependencies:
python3 -m pip install -r requirements.txt

Running

./main.py http://spacejam.com
# or
./main.py http://spacejam.com --debug

The script prints each URL it visits on an unindented line, followed by the links it finds on that page, one per indented line.
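The output layout described above can be illustrated with a short sketch. This is an assumption about the format, not the code in main.py: `LinkCollector` and `format_page` are hypothetical names, the four-space indent is a guess, and the example uses only the standard library (`html.parser`, `urllib.parse`) on a hard-coded HTML string instead of fetching a page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def format_page(url, html, indent="    "):
    """Return the visited URL unindented, then one indented line per link."""
    collector = LinkCollector(url)
    collector.feed(html)
    lines = [url] + [indent + link for link in collector.links]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = '<a href="/about.html">About</a> <a href="http://example.com/">Ext</a>'
    print(format_page("http://spacejam.com", sample))
```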

To run the test:

./test.py