A simple web crawler implemented in Python. It starts at the landing page, discards everything except the hyperlinks found there, then visits each of those links and repeats the process until the given depth is reached. It uses the SQLAlchemy and BeautifulSoup libraries.
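
The depth-limited process described above can be sketched roughly as follows. This is an illustration, not the code in crawly.py: to stay dependency-free it uses the standard library's `html.parser` in place of BeautifulSoup and leaves out the SQLAlchemy storage step. The names `LinkParser`, `extract_links`, `fetch`, and `crawl` are hypothetical, and the sketch assumes the landing page counts as depth 0.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag and ignores all other content."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return the absolute http(s) links in html, resolved against base_url."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


def fetch(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_depth, fetch_page=fetch):
    """Breadth-first crawl; returns {url: depth} for every page discovered."""
    seen = {start_url: 0}  # assumption: the landing page is depth 0
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth >= max_depth:
            continue  # do not follow links beyond the requested depth
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that cannot be downloaded
        for link in extract_links(html, url):
            if link not in seen:
                seen[link] = depth + 1
                queue.append(link)
    return seen
```

Passing the fetcher as a parameter keeps the traversal logic testable without network access; the `seen` dictionary also prevents the crawler from revisiting a page that is linked from several places.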
The file crawly.py takes two input parameters:
- The link to the first (landing) page
- The desired crawl depth
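
As a hedged illustration of how these two parameters might be read, here is a minimal `argparse` sketch. How crawly.py actually receives its input is not stated here, so the positional argument names `url` and `depth`, their order, and the non-negative check are all assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the two crawler inputs; argument names are hypothetical."""
    parser = argparse.ArgumentParser(
        prog="crawly.py",
        description="Crawl links starting from a landing page.",
    )
    parser.add_argument("url", help="link to the first (landing) page")
    parser.add_argument("depth", type=int, help="desired crawl depth")
    args = parser.parse_args(argv)
    if args.depth < 0:
        # Assumption: a negative depth is treated as a usage error.
        parser.error("depth must be a non-negative integer")
    return args
```

Under these assumptions, `parse_args(["https://example.com", "2"])` yields an object whose `url` is the landing page link and whose `depth` is the integer 2.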