Stack Exchange is a network of Q&A
websites covering various fields. Users earn reputation and score tag points based on the tags of the questions and answers they involve themselves with. Each user has a tag section under his/her profile page that lists the tag names and the respective counts. The Python scripts in this repository parse and extract the tag names and scores, which could then be fed to wordcloud module for Python to produce a word cloud image with tags being the words and their respective sizes being proportional to the respective scores. The scripts could extract such information from all Stack Exchange Q&A sites, including of course it's biggest Q&A
site
Stack Overflow.
Let's take Stack Overflow's highest reputation user Jon Skeet as the sample. His profile page link has the ID : 22656
. So, a minimal Python script to generate his tag-cloud would be -
>>> from stackoverflow_users_taginfo import tag_cloud
>>> tag_cloud(link = 22656)
Giving it more options, here's a tag-cloud with the first 1000
tags on a 4K canvas
being produced using example_extensive.py -
As a demo on extracting tag information and generating tag-cloud from other Q&A sites, here's a tag-cloud of Jon Skeet's meta.stackexchange profile generated with example_extensive2.py -
We are living in 8K
age, so here's Jon Skeet's 1000
tags on 8K
canvas!
- Python 2.x or 3.x.
- Python modules : NumPy, Requests, urllib.
- BeautifulSoup - To extract html information. Works with version 4.4.1, might work with older versions too, but not tested.
- Word_cloud - Word cloud creation : Version 1.3.1 or newer.