My-Little-Crawler
The little crawler is still growing.
What it can do so far:
- read the robots.txt of a given website
- search the web through search engines (this may cause copyright or IP problems; I plan to delete or modify the relevant code)
- download binary content to a given path, with the option to rename the file
- look up the location of a given IP address, defaulting to the address of the local internet connection
- find links in a given page (I will focus on this part later)
These are just some basic tricks; I still need to work hard.
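
Two of the items above, checking robots.txt rules and finding links in a page, can be sketched with the Python standard library alone. This is only an illustration and not the project's actual code: it assumes Python, and the names `allowed_by_robots`, `LinkCollector` and `find_links` are made up for this example. Both helpers take text that has already been downloaded, so nothing here touches the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_text, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no request is made here.
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def find_links(html_text, base_url):
    """Return all links found in html_text as absolute URLs, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links
```

A crawler could call `allowed_by_robots` before each request and feed every downloaded page to `find_links` to discover where to go next.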
2017.7.18:
Program updated; the getHTML function has been deleted.
2017.8.31:
Program updated; it can now download captcha images and log in to websites that use captchas by keeping the cookies.
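
Keeping the cookies matters because the server usually ties the captcha image to a session cookie, so the login request has to send that same cookie back. A minimal sketch of the idea with the Python standard library follows. It assumes Python and is not the project's actual code; `make_session`, `fetch_captcha` and `login` are hypothetical names, and the form field names depend entirely on the target site.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_captcha(opener, captcha_url, save_path):
    """Download the captcha image; any session cookie set here lands in the jar."""
    with opener.open(captcha_url) as resp:
        data = resp.read()
    with open(save_path, "wb") as f:
        f.write(data)
    return save_path


def login(opener, login_url, fields):
    """POST the form fields through the same opener so the cookies are sent back."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    with opener.open(login_url, data=body) as resp:
        return resp.read()
```

The intended flow is: call `make_session` once, call `fetch_captcha` and read the saved picture, then pass the typed captcha text together with the username and password to `login` using the same opener.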