This script analyzes and aggregates the tags used in post titles on the /r/unixporn subreddit. It fetches posts from Reddit, extracts tags from post titles, normalizes them, counts their occurrences, and writes the results to a CSV file.
- Python 3.x
- PRAW (Python Reddit API Wrapper)
- Pandas
- NumPy
- Matplotlib
- python-dotenv
- Clone the repository or download the source code.
- Install the required Python packages with `pip install -r requirements.txt` (a sample `requirements.txt` is sketched after this list).
- Set up your Reddit API credentials (see the configuration steps below).
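For reference, the dependency list above corresponds to a `requirements.txt` along these lines (a sketch only; version pins, if the project uses any, are omitted here):

```
praw
pandas
numpy
matplotlib
python-dotenv
```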
You need to create a Reddit API application to get your `client_id` and `client_secret`. Follow these steps:
- Go to [Reddit's apps page](https://www.reddit.com/prefs/apps).
- Click "Create App" or "Create Another App".
- Fill in the details:
- Name: Whatever you prefer
- App type: Choose "script"
- Description: Optional
- About URL: Optional
- Redirect URI: http://localhost:8080 (or another URI of your choice)
- Click "Create app".
- Note down the `client_id` (under the app name) and the `client_secret`.
Create a file named `.env` in the same directory as the script with the following contents:
```
CLIENT_ID=your_client_id_here
CLIENT_SECRET=your_client_secret_here
```
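As a rough illustration (not necessarily how `tag_counter.py` is written), these credentials can be loaded with python-dotenv and passed to PRAW as shown below; the `user_agent` string is an arbitrary placeholder:

```python
import os

import praw
from dotenv import load_dotenv

# Read CLIENT_ID and CLIENT_SECRET from the .env file into the environment.
load_dotenv()

# Build a read-only Reddit client; the user_agent value is just a placeholder.
reddit = praw.Reddit(
    client_id=os.getenv("CLIENT_ID"),
    client_secret=os.getenv("CLIENT_SECRET"),
    user_agent="unixporn-tag-counter (example)",
)

print(reddit.read_only)  # True when no username/password is supplied
```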
Run the script using Python:

```
python tag_counter.py
```
The script will fetch data from the subreddit, process it, and output the results in a file named `tag_counts.csv`.
It will also create a horizontal bar graph from the resulting data.
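As a sketch of what that graphing step might look like with pandas and Matplotlib (the column names `tag` and `count` are assumptions and may not match the actual CSV):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Column names "tag" and "count" are assumed for illustration.
df = pd.read_csv("tag_counts.csv")
top = df.sort_values("count").tail(20)  # keep the 20 most common tags

plt.figure(figsize=(8, 6))
plt.barh(top["tag"], top["count"])
plt.xlabel("Number of posts")
plt.title("Most common tags on /r/unixporn")
plt.tight_layout()
plt.show()
```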
- Fetches top posts from /r/unixporn for each year since the subreddit's creation.
- Extracts and normalizes tags from post titles.
- Counts the occurrence of each tag.
- Outputs the result to a CSV file for easy analysis (see the sketch after this list).
- Creates a graph from the resulting CSV file.
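In rough terms, the tag extraction and counting boils down to something like the sketch below; the bracketed-tag title convention and the trivial `normalize_tag` shown here are illustrative assumptions, not the script's exact rules:

```python
import re
from collections import Counter

import pandas as pd

# Assumed convention: tags appear in square brackets, e.g. "[i3] my first rice".
TAG_PATTERN = re.compile(r"\[([^\]]+)\]")

def normalize_tag(tag: str) -> str:
    # Placeholder normalization: trim whitespace and lowercase.
    return tag.strip().lower()

def count_tags(titles):
    counts = Counter()
    for title in titles:
        for raw_tag in TAG_PATTERN.findall(title):
            counts[normalize_tag(raw_tag)] += 1
    return counts

# Hard-coded titles for illustration; the real script pulls titles via PRAW.
counts = count_tags(["[i3] Minimal rice", "[bspwm] First post", "[I3] Nord again"])
pd.DataFrame(sorted(counts.items()), columns=["tag", "count"]).to_csv(
    "tag_counts.csv", index=False
)
```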
You can customize the script to analyze different subreddits or to change the normalization rules by editing the `normalize_tag` function.
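For example, a customized `normalize_tag` could lowercase tags and merge common aliases; the alias map below is purely illustrative, not the script's actual rule set:

```python
# Illustrative only; the real rules live in the script's normalize_tag function.
ALIASES = {
    "i3wm": "i3",
    "i3-gaps": "i3",
    "awesomewm": "awesome",
}

def normalize_tag(tag: str) -> str:
    tag = tag.strip().lower()
    return ALIASES.get(tag, tag)
```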
This project is open source and available under the GNU GPLv3.
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.
Special thanks to the Reddit community and the developers of PRAW for making this project possible.