This project is just for fun! Use it with caution!
This project automatically crawls BibTeX entries from Google Scholar for a given list of paper titles.
.
├── bibtex.txt
├── CrawlingBibtex
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   ├── settings.py [bibtex save path's configuration]
│   └── spiders
│       ├── fetchscholar_spider.py
│       ├── __init__.py
│       ├── __pycache__
│       └── utils.py
├── main.py
├── papers.csv [papers' titles configuration]
├── README.MD
└── scrapy.cfg
- Python: 3.8
- Scrapy: 2.4.1
Put each paper title on its own line in papers.csv. To skip a title, comment out its line by starting it with '#'.
Run the crawler with the following command:
python main.py
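
To make the papers.csv format concrete (one title per line, with '#' marking lines to skip), here is a minimal sketch of how such a file could be read. It is not taken from this project's code: `load_titles` is a hypothetical helper, and skipping blank lines and trimming whitespace are assumptions of the sketch.

```python
from pathlib import Path
from typing import List


def load_titles(path: str = "papers.csv") -> List[str]:
    """Return paper titles from `path`, one per line.

    Hypothetical helper: blank lines and lines starting with '#' are skipped.
    """
    titles = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Ignore empty lines and titles commented out with '#'.
        if not line or line.startswith("#"):
            continue
        titles.append(line)
    return titles
```

Calling `load_titles("papers.csv")` from the project root would return only the titles that are not commented out.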
Note: you may need to configure a proxy in settings.py, for example:
USE_PROXY = True
HTTP_PROXY = {'proxy': 'http://127.0.0.1:1082'}
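
For readers wondering how two settings like these could take effect, below is a rough sketch of a Scrapy downloader middleware that reads them. It is an illustration only and may differ from what this project's middlewares.py actually does; the class name `ProxyMiddleware` is hypothetical. It relies on Scrapy's built-in proxy support, which honours the `proxy` key in `request.meta`.

```python
class ProxyMiddleware:
    """Hypothetical downloader middleware that routes requests through one proxy."""

    def __init__(self, use_proxy, proxy_url):
        self.use_proxy = use_proxy
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # Read USE_PROXY and HTTP_PROXY from the project's settings.py.
        settings = crawler.settings
        use_proxy = settings.getbool("USE_PROXY", False)
        proxy_url = settings.getdict("HTTP_PROXY").get("proxy")
        return cls(use_proxy, proxy_url)

    def process_request(self, request, spider):
        # Scrapy's built-in HttpProxyMiddleware picks up request.meta["proxy"].
        if self.use_proxy and self.proxy_url:
            request.meta["proxy"] = self.proxy_url
        return None  # let Scrapy continue processing the request normally
```

A middleware like this would also need to be enabled under `DOWNLOADER_MIDDLEWARES` in settings.py before Scrapy would call it.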