A tool for recursively extracting the URLs from either www.curlie.org or www.wikipedia.org (i.e., including sub-pages).
URLs Scrapper is a simple tool that prompts the user for a URL and a search depth, then returns all the links on that page and its sub-pages down to the specified depth, fetching pages asynchronously to speed up the process.
- configurable search depth (set by the user)
- support for websites beyond the two required in the task
- a simple GUI for testing the prototype
- scrapperScript.py: the bare script
- gui.py: the GUI version of the tool
The tool utilizes the following libraries:
- bs4: for web scraping and parsing HTML and XML documents
- asyncio: for asynchronous programming
- aiohttp: for asynchronous HTTP requests
- tkinter: for GUI
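To illustrate how these libraries fit together, here is a minimal sketch of a depth-limited asynchronous link crawler in the spirit of scrapperScript.py. It is not the actual implementation; all function names (`extract_links`, `crawl`, `main`) and design choices here are illustrative assumptions. It uses aiohttp to fetch pages concurrently and bs4 to pull the links out of each page:

```python
import asyncio
from urllib.parse import urljoin, urlparse

import aiohttp
from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return the set of absolute http(s) links found in an HTML page."""
    soup = BeautifulSoup(html, "html.parser")
    links = set()
    for anchor in soup.find_all("a", href=True):
        url = urljoin(base_url, anchor["href"])  # resolve relative hrefs
        if urlparse(url).scheme in ("http", "https"):
            links.add(url.split("#")[0])  # drop in-page fragments
    return links


async def crawl(url, depth, session, seen):
    """Visit url, then recursively visit its links down to `depth` levels."""
    if depth < 0 or url in seen:
        return
    seen.add(url)
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
            html = await resp.text()
    except aiohttp.ClientError:
        return  # skip pages that fail to load
    # Fetch all sub-pages of this level concurrently instead of one by one.
    await asyncio.gather(
        *(crawl(link, depth - 1, session, seen) for link in extract_links(html, url))
    )


async def main(start_url, depth):
    seen = set()
    async with aiohttp.ClientSession() as session:
        await crawl(start_url, depth, session, seen)
    return seen

# Example (performs real network requests):
# found = asyncio.run(main("https://www.wikipedia.org", 1))
```

The `asyncio.gather` call is what gives the speed-up the description mentions: every link at a given depth is fetched concurrently rather than sequentially.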
Mariam Atef - July 2023