web crawler cases

Primary language: JavaScript

syn-spider

This repository collects a few simple web-crawler examples that I wrote while learning.

Features

  • Scripts are written in Node.js
  • superagent is used for web requests
  • nedb is used for persistent storage
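
As a rough illustration of how the pieces listed above could fit together, here is a minimal sketch that fetches a page and stores it. It is not code from this repository: the function name `fetchAndStore`, the stored field names, and the file name `pages.db` are all hypothetical. The function takes the HTTP client and the datastore as arguments, assuming `agent` behaves like `require('superagent')` and `db` like an nedb `Datastore`.

```javascript
const { promisify } = require('util');

// Sketch only. Assumptions: `agent.get(url)` resolves to a response with
// `status` and `text` (as superagent does), and `db.insert(doc, callback)`
// follows nedb's callback style. Field names are illustrative.
async function fetchAndStore(agent, db, url) {
  const res = await agent.get(url);
  const doc = {
    url,
    status: res.status,
    html: res.text,
    fetchedAt: new Date(),
  };
  // Wrap the callback-based insert so it can be awaited.
  const insert = promisify(db.insert.bind(db));
  return insert(doc);
}

// Hypothetical usage (after `npm install superagent nedb`):
//   const superagent = require('superagent');
//   const Datastore = require('nedb');
//   const db = new Datastore({ filename: 'pages.db', autoload: true });
//   fetchAndStore(superagent, db, 'http://example.com/').then(console.log);
```

Passing the client and the datastore in as parameters keeps the sketch independent of any one site and makes it easy to try with stand-ins.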

Samples

  1. www.tctasia.cn: a basic demo that uses the native http module to fetch content from a website served in a Chinese character encoding.
  2. www.rci.com: a demo that uses superagent to crawl deep content.
  3. www.zhihu.com: a demo that simulates user login.
  4. www.itnint.com: a demo that mirrors the website.
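
The idea behind the first sample (fetching a page with the native http module and decoding a non-UTF-8 body) can be sketched roughly as follows. This is an illustration under assumptions, not the repository's actual script: the helper names are hypothetical, the `gbk` fallback is a guess at a typical Chinese encoding, and `TextDecoder` only supports legacy encodings when Node.js is built with full ICU.

```javascript
const http = require('http');

// Extract the charset from a Content-Type header such as
// "text/html; charset=gbk"; return `fallback` when none is declared.
function charsetFromContentType(contentType, fallback = 'utf-8') {
  const match = /charset=["']?([\w-]+)/i.exec(contentType || '');
  return match ? match[1].toLowerCase() : fallback;
}

// Decode raw bytes using the given charset label.
function decodeBody(buffer, charset) {
  return new TextDecoder(charset).decode(buffer);
}

// Fetch a page over plain HTTP and resolve with its decoded text.
// Assumption: pages that declare no charset are treated as gbk.
function fetchPage(url) {
  return new Promise((resolve, reject) => {
    http.get(url, (res) => {
      const chunks = [];
      res.on('data', (chunk) => chunks.push(chunk));
      res.on('end', () => {
        const charset = charsetFromContentType(res.headers['content-type'], 'gbk');
        try {
          // Decode only after all chunks are joined, so multi-byte
          // characters split across chunks are not corrupted.
          resolve(decodeBody(Buffer.concat(chunks), charset));
        } catch (err) {
          reject(err);
        }
      });
      res.on('error', reject);
    }).on('error', reject);
  });
}
```

Collecting the raw chunks and decoding once at the end matters here: calling `toString()` on each chunk would assume UTF-8 and could garble the text.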