jupyter-server/jupyter-scheduler

Queue of jobs with max concurrency

aiqc opened this issue · 1 comment

aiqc commented

I have hundreds of jobs to run, but my server only has enough resources to run 5 jobs at a time.

It would be nice to create a queue of jobs where 5 are always running.

For context, it's all the same task/logic, just for a different input file.
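The requested behaviour can be approximated today with Python's standard library: a pool that keeps at most 5 workers busy and feeds the next queued file as soon as a slot frees up. A minimal sketch (the function and file names are illustrative, not part of jupyter-scheduler):

```python
# Sketch: hundreds of inputs, at most 5 jobs running at any moment.
from concurrent.futures import ProcessPoolExecutor, as_completed

def run_job(input_file):
    # placeholder for the real per-file task
    return f"processed {input_file}"

def run_queue(input_files, max_concurrency=5):
    results = {}
    # the executor keeps at most max_concurrency workers busy,
    # starting the next queued file as soon as one finishes
    with ProcessPoolExecutor(max_workers=max_concurrency) as pool:
        futures = {pool.submit(run_job, f): f for f in input_files}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    return results

if __name__ == "__main__":
    files = [f"input_{i}.csv" for i in range(12)]
    run_queue(files)
```

This only covers a single machine and a single Python process managing the pool; the feature request is about having the scheduler itself provide this.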


How do I solve my problem right now?
I have a CSV file that tracks which files have and haven't been processed. I run a notebook that fetches 5 unprocessed files and feeds them through a series of multiprocessing Pools. It works, but I don't want to wrap the entire queue of jobs in a while loop in case a single job fails. So I have to check back every few hours, and I miss the chance to launch jobs while I sleep.
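The workaround above can be made safe to leave running unattended by recording a per-file status in the tracker CSV, so that one failing job marks its row "failed" and the loop moves on instead of stopping. A hedged sketch, assuming a two-column tracker file (the file name, column names, and helpers here are illustrative):

```python
# Sketch of the CSV-tracking workaround: process 5 pending files per
# batch with a multiprocessing Pool, recording per-file outcomes so a
# failure doesn't stall the whole queue.
import csv
from multiprocessing import Pool

BATCH_SIZE = 5
TRACKER = "tracker.csv"   # assumed columns: filename,status

def process_file(filename):
    try:
        # placeholder for the real per-file work
        return filename, "done"
    except Exception:
        # record the failure and keep going with the rest of the batch
        return filename, "failed"

def load_rows():
    with open(TRACKER, newline="") as f:
        return list(csv.DictReader(f))

def save_rows(rows):
    with open(TRACKER, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["filename", "status"])
        writer.writeheader()
        writer.writerows(rows)

def run_batch():
    rows = load_rows()
    pending = [r["filename"] for r in rows if r["status"] == "pending"]
    batch = pending[:BATCH_SIZE]
    if not batch:
        return False  # nothing left to do
    with Pool(processes=BATCH_SIZE) as pool:
        outcomes = dict(pool.map(process_file, batch))
    for row in rows:
        if row["filename"] in outcomes:
            row["status"] = outcomes[row["filename"]]
    save_rows(rows)
    return True

if __name__ == "__main__":
    # safe to loop overnight: failed rows are skipped on later passes
    while run_batch():
        pass
```

This still launches batches of 5 rather than keeping 5 slots continuously full, which is the gap the feature request is about.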

For comparison, the Airflow equivalent is max_active_runs_per_dag:
https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#max-active-runs-per-dag

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉