This tool helps grab and analyze information from search engines.
git clone https://github.com/TANSixu/SimpleCrawler.git
conda create -n [&choose_your_own_name&] python=2.7
conda activate [&choose_your_own_name&]
pip install -r requirements.txt
- Open the source file simple_crawler.py and set $kw$ at line 20 to the search keyword. (The keyword is set in the source, ugly as that is, so that kanji searches work on all platforms.)
- Run the following command:
python simple_crawler.py
- Optional arguments:
-h, --help show this help message and exit
-n NUM, --num NUM number of pages to crawl
-d DIR_NAME, --dir_name DIR_NAME directory name to save the crawled files
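
For reference, the options listed above could be declared with argparse roughly as follows. This is only a minimal sketch, not the actual code in the repository: the function name `build_parser`, the integer type for `--num`, and both default values are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the documented options.
    # The defaults (1 page, "crawled" directory) are assumed, not taken from the project.
    parser = argparse.ArgumentParser(
        description="Crawl search engine result pages.")
    parser.add_argument("-n", "--num", type=int, default=1,
                        help="number of pages to crawl")
    parser.add_argument("-d", "--dir_name", default="crawled",
                        help="directory name to save the crawled files")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("pages: %d, directory: %s" % (args.num, args.dir_name))
```

With a parser like this, `python simple_crawler.py -n 5 -d results` would request 5 pages and save them under a directory named results. The `-h` option is added by argparse automatically.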