Crawl all GEO metadata. Features:
- crawl platforms
- crawl samples
- crawl series
- incremental crawling
- missed crawling
pip install geo-spider
geo-spider saves files in JSON Lines format (one JSON value per line). See the JSON Lines site for details.
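Because the output is JSON Lines, it can be streamed record by record instead of being loaded all at once. The following is a minimal sketch in Python, not part of geo-spider itself; the helper names `parse_jsonlines` and `read_jsonlines` are hypothetical, and no assumption is made about which fields each record contains.

```python
import json
from typing import Any, Iterable, Iterator


def parse_jsonlines(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one decoded JSON value per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


def read_jsonlines(path: str) -> Iterator[Any]:
    """Stream records from a JSON Lines file (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        yield from parse_jsonlines(handle)
```

For example, `for record in read_jsonlines("platforms.jl"): print(record)` would print each crawled record in turn, assuming platforms.jl was produced by one of the commands in this document.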
By default, geo-spider writes logs at WARNING level to geo-spider.log in the current directory. You can customize this with the -d and -l options:
- -d: enable debug mode
- -l: specify a custom log file
geo-spider -d -l new-geo-spider.log <sub-command>
geo-spider platforms -o platforms.jl
If you have a crawled platforms jsonlines file:
geo-spider platforms -cf platforms.jl -o new-platforms.jl
If you have multiple platforms jsonlines files:
geo-spider platforms -cd platforms -o new-platforms.jl
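If you later want a single file that combines earlier crawl results with a new output file, one possible approach is to merge the JSON Lines files and drop exact duplicate records. The sketch below is an illustration in Python, not a geo-spider feature; `merge_unique` and `merge_files` are hypothetical names, and duplicates are detected by comparing whole records because the record schema is not described here.

```python
import json
from typing import Iterable, Iterator, List


def merge_unique(*sources: Iterable[str]) -> Iterator[str]:
    """Yield each distinct JSON record once, in first-seen order."""
    seen = set()
    for source in sources:
        for line in source:
            line = line.strip()
            if not line:
                continue
            # Re-serialize with sorted keys so key order does not
            # make two identical records look different.
            key = json.dumps(json.loads(line), sort_keys=True)
            if key not in seen:
                seen.add(key)
                yield key


def merge_files(paths: List[str], out_path: str) -> None:
    """Write the de-duplicated union of several .jl files to out_path."""
    handles = [open(p, encoding="utf-8") for p in paths]
    try:
        with open(out_path, "w", encoding="utf-8") as out:
            for record in merge_unique(*handles):
                out.write(record + "\n")
    finally:
        for handle in handles:
            handle.close()
```

A call such as `merge_files(["platforms.jl", "new-platforms.jl"], "all-platforms.jl")` would then produce one combined file; the output file name here is only an example.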
For missed crawling, specify -cf or -cd as for incremental crawling, and add the -m option.
geo-spider platforms -cf platforms.jl -m missed -o new-platforms.jl
geo-spider samples -o samples.jl
geo-spider samples -pcf platforms.jl -cf samples.jl -o new-samples.jl
geo-spider samples -pcf platforms.jl -cf samples.jl -m missed -o new-samples.jl
geo-spider series -o series.jl
geo-spider series -pcf platforms.jl -scf samples.jl -cf series.jl -o new-series.jl
geo-spider series -pcf platforms.jl -scf samples.jl -cf series.jl -m missed -o new-series.jl