This project is a Rust-based web scraping application designed to efficiently fetch, parse, and store data from web pages. It utilizes modern Rust async features, robust error handling, and supports multiple database backends.
- Asynchronous Web Scraping: Fast data retrieval and processing using `reqwest` and `tokio`.
- HTML Parsing: Extract data seamlessly using the `scraper` library.
- Flexible Data Storage: Supports SQLite, PostgreSQL, and Redis.
- Rate Limiting: Built-in rate limiting to respect website request policies.
- Error Handling: Comprehensive error management to handle network and parsing errors gracefully.
- Deployment Ready: Containerized setup for easy deployment.
- Monitoring: Integrated logging and monitoring setup.
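To illustrate the rate-limiting idea from the feature list, here is a minimal, self-contained sketch of a fixed-interval limiter. It uses only the standard library and a synchronous `sleep` for clarity; the `RateLimiter` type and its API are hypothetical and not taken from this project's actual code, which would use `tokio`'s async timers instead.

```rust
use std::time::{Duration, Instant};

// Hypothetical sketch: allows at most `requests_per_second` calls,
// spacing them out by sleeping between requests.
struct RateLimiter {
    interval: Duration,
    last_request: Option<Instant>,
}

impl RateLimiter {
    fn new(requests_per_second: u32) -> Self {
        Self {
            interval: Duration::from_secs(1) / requests_per_second,
            last_request: None,
        }
    }

    /// Blocks until the next request is allowed, then records it.
    fn wait(&mut self) {
        if let Some(last) = self.last_request {
            let elapsed = last.elapsed();
            if elapsed < self.interval {
                std::thread::sleep(self.interval - elapsed);
            }
        }
        self.last_request = Some(Instant::now());
    }
}

fn main() {
    let mut limiter = RateLimiter::new(2); // at most 2 requests per second
    let start = Instant::now();
    for i in 0..3 {
        limiter.wait();
        println!("request {} at {:?}", i, start.elapsed());
    }
}
```

In an async codebase the same pattern would replace `std::thread::sleep` with `tokio::time::sleep`, or use a `tokio::sync::Semaphore` to bound concurrent in-flight requests.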
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
To build and run the software, you will need:
- Rust Programming Language
- Cargo (Rust's package manager)
- Docker (optional, for containerization)
- Access to one of the supported databases (SQLite, PostgreSQL, or Redis)
Clone the repository:

```shell
git clone git@github.com:dhodyrev/async-rust-scraper.git
cd async-rust-scraper
```
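Once cloned, the usual Cargo workflow should apply (assuming the project builds with a current stable toolchain; exact binary names and flags may differ in this repository):

```shell
# Build an optimized binary and run it
cargo build --release
cargo run --release
```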