scrapy/scrapyd

Persistent scheduling based on CRON

kanchansapkota27 opened this issue · 4 comments

Is it possible to extend the project with one extra endpoint for scheduling, say POST http://localhost:6800/cronschedule.json, with extra arguments like cron_exp, and with db_url as a setting in scrapyd.cfg?

Maybe it could be implemented with APScheduler, which provides a TwistedScheduler.

Is it possible by modifying the scrapyd code?

PS: I am a beginner and just looking for possibilities.

I think a simple solution is to create a cronjob on another machine that sends requests to schedule.json.
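A minimal sketch of that approach, using only the Python standard library. The project name, spider name, script path, and cron timing are hypothetical placeholders; the URL assumes Scrapyd is listening on localhost:6800 as in the question. The script posts the `project` and `spider` parameters to schedule.json, and cron decides when it runs.

```python
#!/usr/bin/env python3
"""Ask a running Scrapyd instance to schedule one spider run.

Intended to be called from cron. Example crontab entry (hypothetical
paths and names), running every day at 03:00:

    0 3 * * * /usr/bin/python3 /path/to/schedule_spider.py myproject myspider
"""
import json
import sys
import urllib.parse
import urllib.request

# Assumption: Scrapyd runs locally on its default port.
SCRAPYD_URL = "http://localhost:6800/schedule.json"


def build_request(url, project, spider, **spider_args):
    """Build the form-encoded POST request for schedule.json."""
    params = {"project": project, "spider": spider, **spider_args}
    data = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def schedule(project, spider, url=SCRAPYD_URL, **spider_args):
    """Send the request and return Scrapyd's decoded JSON response."""
    request = build_request(url, project, spider, **spider_args)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Only contact Scrapyd when both a project and a spider name are given.
if __name__ == "__main__" and len(sys.argv) == 3:
    result = schedule(sys.argv[1], sys.argv[2])
    print(result)
    # Scrapyd reports "status": "ok" together with a job id on success.
    sys.exit(0 if result.get("status") == "ok" else 1)
```

Because the schedule lives in the crontab, it survives restarts of Scrapyd itself, which covers the "persistent" part of the request without a database. The non-zero exit code on failure lets cron's usual mail or logging report a run that could not be scheduled.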

Yes, I have come across solutions that do what you suggested. I was just looking at the possibility of implementing the feature in a single scrapyd service, rather than running another service on top of it just for periodic persistent schedules.

cron is available in pretty much every Linux distribution, so in this case I think it's fine to use multiple tools, rather than put all the features into one monolith.

I think I will go with that, thank you.