urls-scrapper 🔗

BA Internship preferences task #3

A tool for recursively extracting URLs from www.curlie.org or www.wikipedia.org, including the URLs found on their sub-pages.

Overview 🗒️

URLs Scrapper is a simple tool that prompts the user for a URL and a depth, then returns all the links found on that page and its subpages down to the specified depth. Pages are fetched asynchronously to speed up the process.
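The depth-limited, asynchronous traversal described above can be sketched as follows. This is a minimal illustration and not the code in scrapperScript.py: `FAKE_SITE`, `fetch_links`, and `crawl` are hypothetical names, the network is replaced by an in-memory dictionary so the example runs offline, and it assumes that a depth of 1 means only the start page is fetched.

```python
import asyncio

# Hypothetical in-memory "site": each URL maps to the links found on that page.
FAKE_SITE = {
    "https://example.org/": ["https://example.org/a", "https://example.org/b"],
    "https://example.org/a": ["https://example.org/c"],
    "https://example.org/b": ["https://example.org/c", "https://example.org/"],
    "https://example.org/c": ["https://example.org/d"],
}


async def fetch_links(url):
    """Stand-in for an HTTP request plus HTML parsing (assumption, no real I/O)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return FAKE_SITE.get(url, [])


async def crawl(start_url, depth):
    """Return every link seen while fetching at most `depth` levels of pages."""
    visited = {start_url}  # pages already fetched or scheduled
    found = set()          # every link seen on a fetched page
    frontier = [start_url]
    for _ in range(depth):
        # Fetch all pages of the current level concurrently.
        results = await asyncio.gather(*(fetch_links(u) for u in frontier))
        next_frontier = []
        for links in results:
            for link in links:
                found.add(link)
                if link not in visited:
                    visited.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
        if not frontier:
            break
    return found


links = asyncio.run(crawl("https://example.org/", depth=2))
print(sorted(links))
```

Processing one level at a time keeps the depth bookkeeping simple, and the `visited` set prevents pages that link back to each other from being fetched twice.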

What my code offers 🤓

Multiple-depth search (depth set by the user)
Support for websites beyond the two required by the task
A simple GUI for testing the prototype

Included Files 📂

scrapperScript.py: the bare script
gui.py: GUI version of the tool

Used Programming Language 💻

Python

Used Libraries ➕

The tool utilizes the following libraries:

bs4: for web scraping and parsing HTML and XML documents
asyncio: for asynchronous programming
aiohttp: for asynchronous HTTP requests
tkinter: for the GUI
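To illustrate the parsing step, here is a rough sketch of pulling absolute links out of a page. The tool itself uses bs4 for this; the sketch swaps in the standard library's `html.parser` so it runs without third-party packages. `LinkCollector`, `extract_links`, and the sample HTML are hypothetical and do not come from the project.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects absolute http(s) links from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL and drop "#fragment" parts.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        # Keep web links only (skips mailto:, javascript:, etc.) and avoid duplicates.
        if absolute.startswith(("http://", "https://")) and absolute not in self.links:
            self.links.append(absolute)


def extract_links(html, base_url):
    """Return the de-duplicated absolute links found in `html`, in page order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


sample = """
<a href="/wiki/Python">Python</a>
<a href="https://curlie.org/en/Computers#top">Computers</a>
<a href="mailto:someone@example.org">Mail</a>
<a name="no-href">anchor</a>
"""
print(extract_links(sample, "https://www.wikipedia.org/"))
```

Resolving links against the page URL matters for recursion: a relative link such as `/wiki/Python` cannot be fetched on its own.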

Author

Mariam Atef - July 2023

MariamAtef226/urls-scrapper