SLMT/dcard-crawler

NullPointerException while fetching a list of posts

Closed this issue · 3 comments

SLMT commented

When I run the following command:

java -jar dcard-crawler.jar fetch-post -e -f PHOTOGRAPHY 10 photos

It gives the following output:

Sep 12, 2016 2:15:14 PM slmt.crawler.dcard.downloader.DcardPostDownloader <init>
INFO: creates a new directory at photos
Sep 12, 2016 2:15:14 PM slmt.crawler.dcard.downloader.DcardPostDownloader setTargetForum
INFO: only download the posts in forum: PHOTOGRAPHY
Sep 12, 2016 2:15:14 PM slmt.crawler.dcard.downloader.DcardPostDownloader onlyWithImage
INFO: only download the posts with images: true
Sep 12, 2016 2:15:14 PM slmt.crawler.dcard.downloader.DcardPostDownloader downloadPosts
INFO: retrieving the list of first 30 posts of PHOTOGRAPHY forum
Exception in thread "main" java.lang.NullPointerException
        at slmt.crawler.dcard.downloader.DcardPostDownloader.downloadPosts(DcardPostDownloader.java:96)
        at slmt.crawler.dcard.downloader.DcardPostDownloader.downloadPosts(DcardPostDownloader.java:73)
        at slmt.crawler.dcard.action.FetchPostAction.execute(FetchPostAction.java:79)
        at slmt.crawler.dcard.action.TopAction.execute(TopAction.java:44)
        at slmt.crawler.dcard.DcardCrawler.main(DcardCrawler.java:12)

It seems like there is a NullPointerException while fetching a list of posts.

SLMT commented

After a bit of checking, I found that when the program tries to connect to the Dcard API, it only gets a 403 response. However, I can still reach the API from Firefox.

Here is the URL I tested: https://www.dcard.tw/_api/forums/photography/posts?popular=false. This is also the URL used by the program.

I am not sure why this happened. It needs some deeper investigation.
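
For reference, here is a minimal sketch of how the 403 could be reported directly instead of surfacing later as a NullPointerException. It assumes the NPE comes from using a response that was never populated after the failed request, which this thread does not confirm. `StatusCheck` and `requireSuccess` are hypothetical names for illustration, not part of dcard-crawler.

```java
import java.io.IOException;

class StatusCheck {
    // Hypothetical helper: throw a descriptive error for any non-2xx status
    // so the caller never goes on to parse a missing response body.
    static void requireSuccess(int status, String url) throws IOException {
        if (status < 200 || status >= 300) {
            throw new IOException("Unexpected HTTP " + status + " from " + url);
        }
    }

    public static void main(String[] args) {
        String url = "https://www.dcard.tw/_api/forums/photography/posts?popular=false";
        try {
            requireSuccess(200, url); // a successful status passes silently
            requireSuccess(403, url); // the status reported in this issue
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, the status code would appear in the error message, which makes this kind of failure easier to diagnose than a bare stack trace.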

SLMT commented

I found the problem. According to this Stack Overflow post, it seems we need to set the User-Agent header on the HTTP request.

I tried this solution, and it works!
I will fix it and release a new version ASAP.
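
A rough sketch of the idea, assuming the request is made with `java.net.HttpURLConnection` (the actual code in dcard-crawler may use a different client). By default `HttpURLConnection` identifies itself as `Java/<version>`, which some servers reject. The class name `UserAgentExample`, the method `open`, and the User-Agent string below are placeholders for illustration, not the values used in the fix.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class UserAgentExample {
    // Prepare a GET request with an explicit User-Agent header.
    // Nothing is sent until the caller reads the response
    // (for example via getResponseCode() or getInputStream()).
    static HttpURLConnection open(String address, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = open(
                "https://www.dcard.tw/_api/forums/photography/posts?popular=false",
                "Mozilla/5.0"); // placeholder value
        // Show the header that will be sent; this does not contact the server.
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```

The header has to be set before the connection is opened; calling `setRequestProperty` after the request has been sent throws an `IllegalStateException`.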

SLMT commented

This bug has been fixed in version 0.2.1. Please check out:
https://github.com/SLMT/dcard-crawler/releases/tag/v0.2.1