CGRU/cgru

Low priority services

lithorus opened this issue · 6 comments

Issue :
We have render nodes that serve two purposes, depending on the time of day and demand. Specifically, we have nodes that are dedicated to Nuke renders during the day, but after work hours (or whenever people stop submitting jobs) they need to fall back to doing CG renders.

Current solution :
Either have them stay dedicated or have a service that enables/disables the service at specific times of the day.

Proposed solution :
The proposal is to have "Low Priority" services that can be ejected and disabled with a specific command to the render node. Then, after the render node has been idle for a certain period, they are automatically enabled again.

Other render managers can abort jobs that are not of a specific service type, but constantly aborting jobs can create incomplete renders. If we also disable the services after the jobs are aborted, we can keep this to a minimum.

Hi!
I see. We have the same issue, and for now we also enable/disable services by hand.
It is a big issue. For now a service is just a string (and the services description is a list of strings).
To handle this issue, a service should be a class with at least such fields as name and priority,
plus time_disabled, disable_period and maybe more.

( This implementation will take a lot of time. )
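
A minimal sketch of what such a service class could look like. The struct and method names here are hypothetical and do not exist in CGRU; only the field names name, priority, time_disabled and disable_period come from the comment above. The sketch assumes times are Unix seconds and that a zero disable_period means "stay disabled until re-enabled by hand".

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch, not the actual CGRU/Afanasy implementation.
struct Service {
    std::string name;
    int priority = 0;                 // assumed: lower value means lower priority
    std::int64_t time_disabled = 0;   // Unix time (seconds) when disabled; 0 means enabled
    std::int64_t disable_period = 0;  // seconds to stay disabled; 0 means no automatic re-enable

    bool isDisabled() const { return time_disabled != 0; }

    // Mark the service as disabled starting at 'now' for 'period' seconds.
    void disable(std::int64_t now, std::int64_t period) {
        assert(now > 0 && period >= 0);
        time_disabled = now;
        disable_period = period;
    }

    // Re-enable the service once its disable period has elapsed.
    // Returns true only if the state changed from disabled to enabled.
    bool refresh(std::int64_t now) {
        if (!isDisabled() || disable_period == 0) return false;
        if (now - time_disabled < disable_period) return false;
        time_disabled = 0;
        return true;
    }
};
```

A render could call refresh() for each of its services on every update cycle, so that a disabled service comes back without any external scheduler.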

I'm happy to help with it. Probably needs more discussion on how to implement it.

One way of solving it is to have a way of ejecting tasks based on a service name filter. If it's a regex, it can be used for both exclusion and inclusion. Combined with the job creation event, the same thing can be achieved with custom event scripts, but not as cleanly.

Adding a pool operation to eject tasks by service, where we can pass a regex, is not a big deal.
Maybe we can create some secondary services list on the pool, to automate the process.
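
As a rough illustration of ejecting by a service name regex, the sketch below selects the running tasks whose service matches a pattern. Everything in it is an assumption for illustration: the Task struct, the tasksToEject function and the invert flag are hypothetical names, and the real pool operation would work on Afanasy's own task objects. The invert flag is one simple way to get exclusion without writing a negative lookahead.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for a task running on a render node.
struct Task {
    int id;
    std::string service;
};

// Returns the ids of tasks whose service name fully matches 'pattern'.
// With invert == true, returns the ids of tasks that do NOT match instead.
std::vector<int> tasksToEject(const std::vector<Task>& tasks,
                              const std::string& pattern,
                              bool invert = false) {
    const std::regex re(pattern);
    std::vector<int> ids;
    for (const Task& t : tasks) {
        const bool match = std::regex_match(t.service, re);
        if (match != invert) ids.push_back(t.id);
    }
    return ids;
}
```

For example, with tasks running the services nuke, blender and houdini (example names only), the pattern "blender|houdini" selects the two CG tasks, and the pattern "nuke" with invert set to true selects the same two.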

Yes, this issue needs more thinking.

So, similar to the disabled services on a pool/render, we could have a list for low priority services?

Yes.
But probably just std::list<std::string> will not be enough; it should be some structure/class with a service name and some time (seconds) fields such as time_disabled, disable_timeout, and maybe some others (behaviour flags).

But if we create a list of low priority service classes, there will be 3 lists: common services, disabled services and the new low priority service classes.
Maybe it is better to replace the common services std::list<std::string> property with this new list of classes.
That class can have a disabled flag, so we can get rid of the disabled services list and have just one service list.
Maybe that would be a simpler and plainer solution?
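
The single-list idea could be sketched roughly as follows. This is only an illustration under assumptions: ServiceEntry, ServiceList, canRun and disableLowPriority are hypothetical names, and the time fields discussed above are left out to keep the focus on how flags on one list can replace the separate disabled and low priority lists.

```cpp
#include <algorithm>
#include <list>
#include <string>

// Hypothetical entry replacing a plain service name string.
struct ServiceEntry {
    std::string name;
    bool disabled = false;      // replaces membership in a separate disabled list
    bool low_priority = false;  // replaces membership in a separate low priority list
};

using ServiceList = std::list<ServiceEntry>;

// A render can take a task only if the service is listed and not disabled.
bool canRun(const ServiceList& services, const std::string& name) {
    auto it = std::find_if(services.begin(), services.end(),
                           [&](const ServiceEntry& s) { return s.name == name; });
    return it != services.end() && !it->disabled;
}

// Disables every enabled low priority service; returns how many were changed.
int disableLowPriority(ServiceList& services) {
    int changed = 0;
    for (ServiceEntry& s : services) {
        if (s.low_priority && !s.disabled) {
            s.disabled = true;
            ++changed;
        }
    }
    return changed;
}
```

With this layout the old lists become simple filters over one container, at the cost of changing every place that currently treats a service as a plain string.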