Twitter Scraper Documentation

Introduction

This Python script scrapes tweets from a single Twitter user's profile. It uses the Selenium library to automate Google Chrome, which requires a matching web driver (ChromeDriver). This documentation explains how to set up and run the script.

Prerequisites

Before using this script, you need to have the following software and libraries installed:

Python 3.x
Selenium (Python library)
Pandas (Python library)
webdriver-manager (Python library)
Chrome web browser
ChromeDriver

Make sure to install the required Python libraries using pip if you haven't already:

pip install selenium pandas webdriver-manager

Additionally, ensure that ChromeDriver is installed and available on your system's PATH, in a version that matches your installed Chrome browser. You can download it from the official ChromeDriver downloads page.
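Because the pip command above also installs webdriver-manager, the driver can be resolved from Python instead of being downloaded by hand. The following is a minimal sketch, assuming Selenium 4; the helper name build_chrome_driver is hypothetical, and whether the script itself starts Chrome this way is not stated in this documentation.

```python
def build_chrome_driver():
    """Start Chrome with a driver binary resolved by webdriver-manager."""
    # Imported inside the function so this file can still be loaded
    # on a machine where the dependencies are not installed yet.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    # install() downloads a ChromeDriver build matching the local Chrome
    # (or reuses a cached copy) and returns the path to the binary.
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service)


# Example use (opens a real browser window, so it is left commented out):
# driver = build_chrome_driver()
# driver.get("https://example.com")
# driver.quit()
```

Recent Selenium releases (4.6 and later) can also locate a driver on their own, so calling webdriver.Chrome() with no arguments may be enough on an up-to-date installation.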

Usage

Clone or download the script to your local machine.
Create a file named passwords.py in the same directory as the script. Inside passwords.py, define your Twitter username and password as follows:

username = "your_twitter_username"
password = "your_twitter_password"

Note: Storing your password in a plain-text file is a security risk. Keep passwords.py out of version control, and consider alternative authentication methods when building a production-ready application.

Open a terminal or command prompt and navigate to the directory where the script is located.
Run the script with Python.
Follow the on-screen prompts in the terminal:
- Enter the Twitter username (without the "@" symbol) for the profile you want to scrape.
- Choose whether you want to export the scraped data to CSV or JSON format by typing csv or json when prompted.
The script then scrapes tweets from the specified user's profile, printing progress information in the terminal. When scraping is complete, it saves the data in the chosen format (CSV or JSON) to a file named after the Twitter username.
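The note about passwords.py in the steps above recommends against keeping the password in a script. One common alternative is to read the credentials from environment variables. This is only a sketch: the variable names TWITTER_USERNAME and TWITTER_PASSWORD and the helper load_credentials are assumptions made for illustration, and the script as documented still expects passwords.py.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    TWITTER_USERNAME and TWITTER_PASSWORD are hypothetical names chosen
    for this example; they are not defined by the script itself.
    """
    try:
        return env["TWITTER_USERNAME"], env["TWITTER_PASSWORD"]
    except KeyError as missing:
        # Report which variable is absent without echoing any secret value.
        name = missing.args[0]
        raise RuntimeError(f"environment variable {name} is not set") from None
```

A passwords.py that still satisfies the script could then contain a single line, username, password = load_credentials(), provided the two variables are exported in the shell before the script runs.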

Output

The script generates a single output file in either CSV or JSON format, depending on your choice. The filename is based on the Twitter username you entered at the prompt.
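As an illustration of how the two prompt answers could be turned into an output filename, here is a small sketch. The exact naming scheme is an assumption (username plus the format as extension), and the helper names clean_handle and output_path are hypothetical rather than taken from the script.

```python
from pathlib import Path

SUPPORTED_FORMATS = ("csv", "json")


def clean_handle(raw):
    """Strip whitespace and any leading '@' from a typed username."""
    handle = raw.strip().lstrip("@")
    if not handle:
        raise ValueError("username must not be empty")
    return handle


def output_path(raw_handle, raw_format):
    """Build '<username>.<format>' from the two prompt answers.

    The '<username>.<format>' pattern is an assumed naming scheme.
    """
    fmt = raw_format.strip().lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"expected one of {SUPPORTED_FORMATS}, got {raw_format!r}")
    return Path(f"{clean_handle(raw_handle)}.{fmt}")
```

Normalizing the answers first means that input such as " @SomeUser " or "CSV" is handled the same way as the plain forms, and anything other than csv or json is rejected before scraping starts.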

CSV Output: A CSV file containing two columns: user and text. Each row represents a scraped tweet, where user is the Twitter username of the author, and text is the content of the tweet.
JSON Output: A JSON file containing an array of objects, where each object has two properties: user (the Twitter username) and text (the content of the tweet).
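To show that both formats carry the same records, the sketch below reads either kind of file back into a list of dictionaries with the user and text keys described above. It uses only the standard library and assumes the CSV has a header row with exactly those two columns; load_tweets and the sample row are made up for the example.

```python
import csv
import json
import tempfile
from pathlib import Path


def load_tweets(path):
    """Return a list of {'user': ..., 'text': ...} dicts from a CSV or JSON export."""
    path = Path(path)
    if path.suffix == ".json":
        with path.open(encoding="utf-8") as handle:
            return json.load(handle)
    if path.suffix == ".csv":
        # newline="" lets the csv module handle line breaks inside tweet text.
        with path.open(newline="", encoding="utf-8") as handle:
            return [dict(row) for row in csv.DictReader(handle)]
    raise ValueError(f"unsupported file type: {path.suffix!r}")


# Round-trip one made-up row through both formats to compare the results.
sample = [{"user": "example_user", "text": "hello, world"}]
with tempfile.TemporaryDirectory() as tmp:
    json_file = Path(tmp) / "example_user.json"
    json_file.write_text(json.dumps(sample), encoding="utf-8")

    csv_file = Path(tmp) / "example_user.csv"
    with csv_file.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["user", "text"])
        writer.writeheader()
        writer.writerows(sample)

    from_json = load_tweets(json_file)
    from_csv = load_tweets(csv_file)
```

If the real CSV contains an extra index column (pandas writes one by default unless told otherwise), each row would carry one additional key, so the reader may need adjusting.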

Note