Distributed Management Framework Based on Scrapy and Scrapyd.
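Introduction in Chinese: Say goodbye to tedious command lines! The Gerapy distributed crawler management framework is here!
Gerapy is developed on Python 3.x. Support for Python 2.x is planned. Install it with pip: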
pip3 install gerapy
After installing Gerapy, the 'gerapy' command should be available. If it is not, check the installation.
Next use this command to initialize the workspace:
gerapy init
This creates a folder named gerapy.
Then cd into this folder and run this command to initialize the database:
cd gerapy
gerapy migrate
Next, start the server with this command:
gerapy runserver
Then you can visit http://localhost:8000 to enjoy it.
Or you can configure host and port like this:
gerapy runserver 0.0.0.0:8888
The server then listens on all network interfaces (0.0.0.0) on port 8888, so it can be reached from other machines.
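To confirm that the server is up after running the command, a quick check like the following may help. It is a minimal sketch and not part of Gerapy: the helper name is_listening is hypothetical, and the host and port are whatever you passed to gerapy runserver.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumes the server was started with 0.0.0.0:8888 on this machine.
    print(is_listening("127.0.0.1", 8888))
```

This only tells you that something accepts connections on that port, not that it is Gerapy, so opening the address in a browser remains the real test.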
You can create a configurable project, then configure it and generate its code automatically.
You can also drag an existing Scrapy project into the gerapy/projects folder. After you refresh the web page, it appears on the Project Index Page marked as un-configurable, but you can still edit it in the web interface.
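Dragging a project into the folder amounts to copying its directory, so the same step can be scripted. The following is a minimal sketch: the function name add_project is hypothetical, and the scrapy.cfg check assumes the standard Scrapy project layout rather than anything Gerapy itself requires.

```python
import shutil
from pathlib import Path


def add_project(project_dir: str, workspace: str = "gerapy") -> Path:
    """Copy a Scrapy project directory into <workspace>/projects."""
    src = Path(project_dir).resolve()
    # Assumption: a standard Scrapy project has scrapy.cfg at its root.
    if not (src / "scrapy.cfg").is_file():
        raise ValueError(f"{src} does not look like a Scrapy project")
    dest = Path(workspace) / "projects" / src.name
    # copytree creates missing parent directories and fails if dest already exists.
    shutil.copytree(src, dest)
    return dest
```

For example, add_project("/path/to/myproject") run from the directory that contains the gerapy workspace would copy the project to gerapy/projects/myproject.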
To deploy, go to the Deploy Page. First build your project and add a client, then deploy the project by clicking the button.
After deployment, you can manage jobs on the Monitor Page.
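Since Gerapy is built on Scrapyd, a client is presumably a Scrapyd server, and the job information shown in the web interface is the kind of data Scrapyd exposes through its JSON API. The sketch below illustrates such a request directly; it is not Gerapy's own implementation. The helper names scrapyd_url and list_jobs are hypothetical, while listjobs.json and daemonstatus.json are Scrapyd endpoints and 6800 is Scrapyd's default port.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def scrapyd_url(host: str, port: int, endpoint: str, **params: str) -> str:
    """Build a URL for one of Scrapyd's JSON endpoints."""
    url = f"http://{host}:{port}/{endpoint}"
    return f"{url}?{urlencode(params)}" if params else url


def list_jobs(host: str, port: int, project: str) -> dict:
    """Fetch the job listing for a project from a Scrapyd server."""
    url = scrapyd_url(host, port, "listjobs.json", project=project)
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

With a Scrapyd server running locally, list_jobs("localhost", 6800, "myproject") would return the parsed JSON response, where "myproject" stands in for a project you have already deployed.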
Client Management:
Spider Monitor:
Project Management:
Project Edit:
Project Deploy:
Project Configuration:
- Add Visual Configuration of Spider with Previewing Website
- Add Scrapyd Auth Management
- Add Automatic Python&Scrapyd Environment Deployment
- Add Timed Task Scheduler
- Add MongoDB & Redis & MySQL Monitor
If you have any questions or ideas, you can join this QQ Group: