# 🌐 Recursive Web Crawler in C#

A simple web crawler in C# that recursively explores links from a webpage!

## 📋 Features

- 🔍 **Parse Links**: Starts by parsing all links from a given URL.
- 🔄 **Recursive Crawling**: Visits each parsed link and extracts further links until the maximum limit is reached.
- 🔗 **Link Extraction**: Uses regular expressions to extract URLs from the page content (see the sketch below).
- 🛡 **Duplicate Protection**: Maintains a `HashSet` of visited URLs to avoid revisiting pages and prevent infinite loops.
- 🎯 **Customizable**: Set the maximum number of URLs to crawl.
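
The snippet below is a minimal sketch of what the regex-based link extraction might look like. The class name `LinkParser`, the `ExtractLinks` helper, and the exact pattern are illustrative assumptions, not the project's actual code:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LinkParser
{
    // Captures absolute http/https URLs from href attributes.
    // This pattern is an assumption; the project's actual expression may differ.
    private static readonly Regex HrefPattern = new Regex(
        @"href\s*=\s*[""'](https?://[^""']+)[""']",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static IEnumerable<string> ExtractLinks(string html)
    {
        foreach (Match match in HrefPattern.Matches(html))
            yield return match.Groups[1].Value;
    }
}
```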

## 🚀 How It Works

1. **Start Crawling**: Specify the starting URL.
2. **Extract Links**: The program fetches the content of the page and extracts all links.
3. **Recursive Visits**: It recursively visits those links, repeating the process.
4. **Stop Condition**: Crawling continues until the defined maximum number of URLs is reached (see the sketch after this list).
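
Putting those steps together, here is one way the crawl loop could look. This is a hedged sketch under stated assumptions, not the repository's implementation: the `Crawler` class, its `CrawlAsync` method, and the use of `HttpClient` are assumptions, and it reuses the hypothetical `LinkParser` from the Features section.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

class Crawler
{
    private readonly HttpClient _http = new HttpClient();
    private readonly HashSet<string> _visited = new HashSet<string>();
    private readonly int _maxUrls;

    public Crawler(int maxUrls) => _maxUrls = maxUrls;

    public async Task CrawlAsync(string url)
    {
        // Stop condition: honor the crawl limit; HashSet<T>.Add returns
        // false for a URL that was already visited, preventing infinite loops.
        if (_visited.Count >= _maxUrls || !_visited.Add(url))
            return;

        Console.WriteLine($"Visiting: {url}");

        string html;
        try
        {
            html = await _http.GetStringAsync(url);
        }
        catch (HttpRequestException)
        {
            return; // unreachable page or network error: skip and move on
        }

        // Recursively visit every link extracted from this page.
        foreach (var link in LinkParser.ExtractLinks(html))
            await CrawlAsync(link);
    }
}
```

A crawl could then be started from an async entry point with, for example, `await new Crawler(maxUrls: 50).CrawlAsync("https://example.com");`, where `50` caps the total number of URLs visited.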