Mastodon Social Platform Scraper

Overview

The Mastodon Social Platform Scraper is a Python web scraping tool that extracts data from the explore page of the Mastodon social platform. It uses the Scrapy framework for structured data extraction and Selenium for dynamically loaded content.

Key Features

Hashtag Scraper: Extracts trending hashtags on Mastodon, providing insights into popular topics.
News Scraper: Collects news data from the explore page, facilitating the analysis of current events.
Timeline Scraper: Scrolls through the timeline dynamically and scrapes post details and reactions as new posts load.
Efficient Data Management: Uses Pandas to organize and store the scraped data.
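
The dynamic scrolling behind the Timeline Scraper can be sketched as a loop that scrolls to the bottom of the page and stops once the page height no longer grows. This is a minimal illustration, not the code used in this repository. The function name scroll_until_stable and its parameters are hypothetical, and it assumes the timeline scrolls at the document level. The only Selenium call it relies on is the driver's execute_script method.

```python
import time


def scroll_until_stable(driver, pause_seconds=2.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is any object exposing Selenium's `execute_script(script)` method.
    Returns the number of scrolls that loaded new content.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    productive_scrolls = 0
    for _ in range(max_rounds):
        # Jump to the current bottom of the page to trigger lazy loading.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause_seconds)  # give newly requested posts time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new appeared, so assume the end of the feed
        last_height = new_height
        productive_scrolls += 1
    return productive_scrolls
```

With a real Selenium driver, this would be called after loading the page and before extracting posts. The max_rounds cap keeps the loop from running indefinitely on a feed that never stops loading.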

Requirements

Python 3.x
Scrapy
Selenium
Pandas
Requests
Chrome WebDriver

Getting Started

Clone the Repository:

git clone https://github.com/Muneeb1030/WebScrapper_Mastodon.git

Install Dependencies:

pip install scrapy selenium pandas requests

Set Chrome WebDriver Path: Update the chrome_driver_path variable in the code with the path to your Chrome WebDriver executable.

Run the Scraper:

```
scrapy crawl mastodon
```
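
For the WebDriver path step, the sketch below shows one way a chrome_driver_path value is typically handed to Selenium 4. It is an assumption about usage, not the code in this repository. The helper name make_chrome_driver is hypothetical, and the path check is an added convenience so that a wrong path fails with a clear message before Chrome is launched.

```python
from pathlib import Path


def make_chrome_driver(chrome_driver_path):
    """Start Chrome through Selenium using an explicit ChromeDriver binary."""
    driver_file = Path(chrome_driver_path).expanduser()
    if not driver_file.is_file():
        raise FileNotFoundError(f"No ChromeDriver binary found at {driver_file}")

    # Imported here so the path check above runs even if Selenium is missing.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Selenium 4 takes the driver location through a Service object.
    service = Service(executable_path=str(driver_file))
    return webdriver.Chrome(service=service)
```

Calling make_chrome_driver with the same value you set for chrome_driver_path should open a Chrome session. Remember to call quit() on the returned driver when scraping is finished.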

Additional Information

Customization:
- Tailor the scraper to your needs by modifying the Scrapy spiders.
GitHub Repository:
- Explore, contribute, and stay updated on the GitHub repository: https://github.com/Muneeb1030/WebScrapper_Mastodon

Disclaimer

This project is intended for educational purposes. Use the scraper responsibly and in compliance with the terms of service and policies of the Mastodon instance you scrape.

Additional Resources

Explore the project in detail through my Medium blog, where I share insights, motivation, and in-depth explanations about the Mastodon Social Platform Scraper.

Contributors

M Muneeb ur Rehman

Feel free to fork, contribute, and enhance the capabilities of this Mastodon scraper. Happy scraping! 🌐💻