My-Little-Crawler

The little crawler is still growing.

What it can do so far:

  1. read the robots.txt of a given website
  2. search the whole internet using search engines (this may cause copyright or IP problems; I plan to remove or modify the relevant code)
  3. download binary content to a given path, with the option to rename the file
  4. look up the location of a given IP address, defaulting to the address of the local network environment
  5. find links in a given page (I will focus on this part later)
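
As an illustration of items 1 and 5, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the names `allowed_by_robots`, `LinkCollector`, and `extract_links` are hypothetical, and the sketch assumes the robots.txt text and the page HTML have already been downloaded as strings.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin turns relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all links found in html as absolute URLs, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

Checking `allowed_by_robots` before each request keeps the crawler polite, and resolving links with `urljoin` means the results can be fed straight back into the crawl queue.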

These are just some basic tricks; there is still a lot of work to do.

2017.7.18:

Program updated; the getHTML function was removed.

2017.8.31:

Program updated: it can now download captcha images and log in to captcha-protected websites by keeping the cookies.
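
The idea behind the captcha login can be sketched as follows, again with the standard library only. This is an assumption about the approach, not the project's code: the helper names, URLs, and form field names are hypothetical. The key point is that the captcha image and the login form are requested through the same cookie-keeping opener, so the server can match the typed captcha to the session.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def save_captcha(opener, captcha_url, path):
    """Download the captcha image through the session and write it to path."""
    with opener.open(captcha_url) as response, open(path, "wb") as out:
        out.write(response.read())


def build_login_request(login_url, fields):
    """Build a POST request carrying the form fields, URL-encoded."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying data makes urllib send the request as a POST.
    return urllib.request.Request(login_url, data=data)


# Hypothetical usage (URLs and field names are placeholders):
#
#   opener, jar = make_session()
#   save_captcha(opener, "https://example.com/captcha.png", "captcha.png")
#   code = input("Type the text shown in captcha.png: ")
#   request = build_login_request(
#       "https://example.com/login",
#       {"user": "me", "password": "secret", "captcha": code},
#   )
#   with opener.open(request) as response:
#       page = response.read()
```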