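Collects the posts from the official Nogizaka46 blog into JSON
files and downloads all of their images, so that the blogs of members who are about to graduate can be backed up.
This project contains: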
./crawler/ : the crawler source code
./demo-site/ : the example site hosted on GitHub Pages
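To make the "posts as JSON, images on disk" idea concrete, here is a minimal sketch of what one backed-up post record could look like. It is not taken from ./crawler/: the field names (`title`, `date`, `body`, `images`), the `images` directory, and the helpers `local_image_name`, `build_record`, and `save_posts` are all hypothetical, and the actual crawler may organize its output differently.

```python
import json
import os
from urllib.parse import urlparse


def local_image_name(url: str) -> str:
    """Return the file-name part of an image URL (hypothetical helper)."""
    return os.path.basename(urlparse(url).path)


def build_record(title, date, body, image_urls, image_dir="images"):
    """Build one post record; the field names are assumptions, not the crawler's schema."""
    return {
        "title": title,
        "date": date,
        "body": body,
        # Keep the original URL next to the path where the image would be saved.
        "images": [
            {"src": url, "local": image_dir + "/" + local_image_name(url)}
            for url in image_urls
        ],
    }


def save_posts(records, path):
    """Write the records as UTF-8 JSON, keeping Japanese text readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```

A record built this way keeps both the remote URL and the local path of each image, so a static page could render the post from the JSON file even after the original blog goes offline.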
Because the whole demo site is hosted on GitHub Pages, the gh-pages
branch contains a large number of files. Cloning every branch takes a long time and wastes disk space.
If you only need the source code, you can clone just the master branch:
$ git clone https://github.com/janelin612/n46-crawler.git --single-branch