- Create a file similar to `arxiv.cat`, listing every category you are targeting.
- Execute `./run.sh < your_arxiv.cat`. Each category will be stored in `<category>.json`.
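
As a rough illustration of the mapping from a category list to output files, here is a minimal Python sketch. It assumes the category file holds one category per line, which this README does not confirm, and the names `plan_crawls` and `sample` are hypothetical. It only builds the `scrapy crawl arxiv -o <category>.json` command lines and does not run them, so it is not a reproduction of what `run.sh` actually does.

```python
import shlex


def plan_crawls(lines):
    """Return (category, output_file, command) for each listed category.

    Assumption: one category per line. Blank lines are skipped, and so are
    lines starting with '#' (this comment convention is also an assumption).
    """
    plans = []
    for raw in lines:
        category = raw.strip()
        if not category or category.startswith("#"):
            continue
        # Each category gets its own output file, named <category>.json.
        output_file = f"{category}.json"
        command = ["scrapy", "crawl", "arxiv", "-o", output_file]
        plans.append((category, output_file, command))
    return plans


if __name__ == "__main__":
    # Example category names; replace with the contents of your own file.
    sample = ["cs.CL", "", "math.CO"]
    for category, output_file, command in plan_crawls(sample):
        print(f"{category}: {shlex.join(command)}")
```

The sketch stops at building the commands because the spider asks for the category and index range interactively, and how `run.sh` supplies those answers is not described here.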
scrapy crawl arxiv -o $type.json
Then enter the category and the index range you want to crawl (default: 0 - 20000) when prompted.
- scrapy (python3)