urls-scrapper 🔗

BA Internship preferences task #3

A tool for recursively extracting the URLs from either www.curlie.org or www.wikipedia.org (i.e., including sub-pages).

Overview 🗒️

URLs Scrapper is a simple tool that prompts the user for a URL and a depth, then returns all the links found on that page and on its sub-pages, down to the specified depth. Pages are requested asynchronously to speed up the process.
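The depth-limited, asynchronous crawl described above can be sketched roughly as follows. This is an illustration, not the code in scrapperScript.py. To stay self-contained it swaps in the standard-library `html.parser` for bs4 and takes the page downloader as a parameter instead of calling aiohttp directly. The names `LinkParser`, `extract_links`, `crawl`, and `fetch` are hypothetical, and the sketch assumes that a depth of 1 means "links on the start page only".

```python
import asyncio
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page (stand-in for bs4)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute, fragment-free http(s) links from html, in page order."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute = urldefrag(urljoin(base_url, href)).url
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


async def crawl(start_url, depth, fetch):
    """Collect links from start_url and its sub-pages, `depth` levels deep.

    `fetch` is a hypothetical coroutine, fetch(url) -> HTML text. In the real
    tool this role would presumably be played by an aiohttp request.
    """
    visited = set()  # pages already requested
    found = set()    # every link seen so far
    frontier = [start_url]
    for _ in range(depth):
        # Drop duplicates and pages that were already requested.
        frontier = [u for u in dict.fromkeys(frontier) if u not in visited]
        if not frontier:
            break
        visited.update(frontier)
        # Request every page of the current level concurrently.
        pages = await asyncio.gather(
            *(fetch(u) for u in frontier), return_exceptions=True
        )
        next_frontier = []
        for url, html in zip(frontier, pages):
            if isinstance(html, Exception):
                continue  # skip pages that failed to load
            for link in extract_links(url, html):
                found.add(link)
                next_frontier.append(link)
        frontier = next_frontier
    return found
```

Given some coroutine `fetch` that downloads a page, `asyncio.run(crawl(url, 2, fetch))` would return the set of links on the start page plus the links on each page it links to. Processing one level at a time keeps the depth bookkeeping simple while still letting all pages of a level download in parallel.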

What my code offers 🤓

  • multi-level search, with the depth chosen by the user
  • support for websites beyond the two required in the task
  • a simple GUI for testing the prototype

Included Files 📂

  • scrapperScript.py: the bare script
  • gui.py: GUI version of the tool

Used Programming Language 💻

Python

Used Libraries ➕

The tool utilizes the following libraries:

  • bs4: for web scraping and parsing HTML and XML documents
  • asyncio: for asynchronous programming
  • aiohttp: for asynchronous HTTP requests
  • tkinter: for GUI

Author

Mariam Atef - July 2023