/easy-hacker-news-stream



The Easiest Way to Create a Hacker News Feed using Python and JavaScript

We’ve all sat in the back of a CS lecture, looked up from our laptop to actually listen, and taken a quick peek at everyone else’s screens. More likely than not, quite a few of them were displaying the all too familiar Hacker News orange. While maybe we should all pay more attention to the speaker, it seems new, cool news always takes precedence. So what if you were determined to never miss a single article? Or what if you wanted to get every update from the site and automate based on that new information? By leveraging PubNub’s real-time global network and scraping a little RSS, you never have to miss a new Hacker News article again.

If you want to see it working live, there is a quick and dirty demo that uses the JavaScript PubNub SDK to display updates to the Hacker News feed. To see it in action locally, clone the source from GitHub and run the Python scraper from the command line.

RSS Scraping

The first task is to grab the RSS feed from Hacker News. There are plenty of ways to do this, and you can quickly write your own RSS scraper if you want, but I decided to use Python and feedparser. After a quick “pip install feedparser”, we can read the feed.

# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
# Get Hacker News RSS
# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
import feedparser

def current_hn(rss):
    # Fetch and parse the feed at the given URL
    return feedparser.parse(rss)

The feed contains a lot of information, and you can take all of it if you want. However, I decided the most interesting parts were the rank of the post, the title, the link to the article, and the comments link.

# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
# Store interesting information
# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
# rss is the parsed feed returned by current_hn()
message = []
for index, entry in enumerate(rss.entries):
    post = {}
    post["rank"] = index + 1  # position in the feed, starting at 1
    post["title"] = entry.title
    post["link"] = entry.link
    post["comments"] = entry.comments
    message.append(post)
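As a rough illustration, the loop above can be wrapped in a function and exercised without touching the network. This is only a sketch: the name build_message, the sample entries, and the fallback to None for a missing comments link are assumptions made for the example, not part of the original script.

```python
from types import SimpleNamespace


def build_message(entries):
    """Turn feed entries into a list of ranked post dictionaries."""
    message = []
    for rank, entry in enumerate(entries, start=1):
        message.append({
            "rank": rank,
            "title": entry.title,
            "link": entry.link,
            # Assumption: use None when an entry carries no comments link.
            "comments": getattr(entry, "comments", None),
        })
    return message


# Stand-in entries for the example; with feedparser you would pass
# the .entries list of the parsed feed instead.
sample = [
    SimpleNamespace(
        title="First post",
        link="https://example.com/1",
        comments="https://example.com/c/1",
    ),
    SimpleNamespace(title="Second post", link="https://example.com/2"),
]

print(build_message(sample))
```

Because the function only reads attributes, it works the same on real feedparser entries and on test stand-ins like the ones above.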

Python Command Line

The scraper uses Python’s argparse module, which provides robust command line options with very little code.

# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
# Argparse
# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
from argparse import ArgumentParser

parser = ArgumentParser(description="Options to parse RSS Feed")
# Seconds to wait between polls of the feed
parser.add_argument("-t", "--time", dest="time_to_wait", type=int, default=10)

Run python hn.py --help to see descriptions of all the options. The options let you specify how often to poll Hacker News for changes, and whether to get the entire page after every change to the site or just the new posts that appear. For instance, to poll every five seconds and get the entire page, run:

python hn.py --mode entire --time 5

Argparse also supplies defaults, so you can run the script with no options:

python hn.py
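To make the two modes concrete, here is a minimal sketch of how the options might drive what gets sent on each poll. Only --time, --mode, and the value entire appear in this README; the second mode name new, the default mode, and the helper names build_parser and select_payload are assumptions for illustration, and the real hn.py may differ.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Options to parse RSS Feed")
    parser.add_argument("-t", "--time", dest="time_to_wait", type=int,
                        default=10, help="seconds to wait between polls")
    # "entire" comes from the README; "new" and the default are assumed names.
    parser.add_argument("--mode", choices=["entire", "new"], default="entire",
                        help="send the whole page or only newly seen posts")
    return parser


def select_payload(mode, previous, current):
    """Return the posts to send for this poll, or [] if nothing changed."""
    if mode == "entire":
        # Send the full page, but only when it differs from the last poll.
        return current if current != previous else []
    # Otherwise send only posts whose link was not on the previous page.
    seen = {post["link"] for post in previous}
    return [post for post in current if post["link"] not in seen]


args = build_parser().parse_args(["--mode", "entire", "--time", "5"])
print(args.mode, args.time_to_wait)
```

Comparing posts by link is one simple way to spot new arrivals; a post that merely changes rank is not treated as new in that mode.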

Go Global

Now that we have the information that is important to us, and know how to run the scraper locally, it’s time to send it global. PubNub provides an incredibly simple API for publishing the message. Run a quick “pip install Pubnub” and publish our information from Hacker News.

# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
# Publish to PubNub
# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
# pubnub is assumed to be an already initialized PubNub client
pubnub.publish({
    "channel": "hacker-news",
    "message": message
})
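To tie the pieces together, a polling loop might look roughly like the sketch below. It is deliberately independent of any SDK: fetch and publish are hypothetical callables you supply, and poll_and_publish and max_polls are names invented for this example rather than anything from the original script.

```python
import time


def poll_and_publish(fetch, publish, time_to_wait=10, max_polls=None):
    """Call fetch() repeatedly and publish the result whenever it changes.

    Returns the number of messages published. With max_polls=None the
    loop runs until interrupted.
    """
    previous = None
    published = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        current = fetch()
        if current != previous:
            publish(current)
            published += 1
            previous = current
        polls += 1
        # Skip the final sleep once the poll budget is used up.
        if max_polls is None or polls < max_polls:
            time.sleep(time_to_wait)
    return published


# Stand-in feed and publisher so the sketch runs without a network.
pages = iter([
    [{"rank": 1, "title": "A"}],
    [{"rank": 1, "title": "A"}],
    [{"rank": 1, "title": "B"}],
])
sent = []
count = poll_and_publish(lambda: next(pages), sent.append,
                         time_to_wait=0, max_polls=3)
print(count, sent)
```

In a real run, fetch would build the message list from the parsed feed, and publish would wrap whichever publish call your version of the PubNub SDK provides.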

Now it’s up to you. PubNub offers over 50 different SDKs, so take your pick. To consume the information, simply subscribe to the channel (in our case “hacker-news”) and you’re off. There are publicly available demo publish and subscribe keys you can use.


Additional Resources

If you want to dive further into PubNub, we have lots of tutorials and walkthroughs. Happy Hacking.