Auto-maximize render capacity

Posted: Fri Jul 21, 2017 9:59 pm
by avclubvids
There's a trick that we've used in other render managers like Deadline that would be very useful in CGRU:

As a job with multiple frames per task nears completion, render nodes often go idle while only a few machines finish the final tasks. In these cases we have had great success with a script that resets the task chunk size so that the remaining frames are divided dynamically among the available nodes. In practice, a final task of 100 frames rendering on 1 machine can become 5 tasks of 20 frames rendering on 5 machines, etc.
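To make the arithmetic concrete, here is a minimal sketch of the splitting step only, in plain Python. The function name split_range and its parameters are made up for illustration; this is not part of Deadline or CGRU, and it ignores everything about actually re-tasking a node.

```python
def split_range(first, last, parts):
    """Split the inclusive frame range [first, last] into at most
    `parts` contiguous chunks of near-equal size.

    Hypothetical helper for illustration; not a CGRU or Deadline API.
    """
    total = last - first + 1
    if total <= 0 or parts <= 0:
        return []
    # Never create more chunks than there are frames.
    parts = min(parts, total)
    base, extra = divmod(total, parts)
    chunks = []
    start = first
    for i in range(parts):
        # The first `extra` chunks take one additional frame each.
        size = base + (1 if i < extra else 0)
        chunks.append((start, start + size - 1))
        start += size
    return chunks


# A final task of 100 frames, with 5 nodes available:
print(split_range(1, 100, 5))
# [(1, 20), (21, 40), (41, 60), (61, 80), (81, 100)]
```

When the frame count does not divide evenly, the earlier chunks simply get one extra frame each.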

The end result is greater utilization of resources and renders that make it across the finish line faster.

An ideal version of this for CGRU would let the server re-task a running node, changing the list of frames it is currently assigned and re-allocating the rest among other nodes. That way no frames would have to be re-rendered by re-queuing a task; the server would simply re-allocate frames dynamically among all of the available render nodes.

You might want this to be a per-job option, as there will be issues with some software running this way, but it's a great way to get a render out in a hurry.

Re: Auto-maximize render capacity

Posted: Sun Jul 23, 2017 4:38 pm
by timurhai
Hi.
I agree with you and see how it can help.
But the implementation is not so clear to me.
Tasks are already allocated at job registration.
For numeric blocks, changing frames per task changes the number of tasks.
For now, the number of tasks in Afanasy cannot be changed.
Also, if we change the number of tasks, we have to recalculate task progress.
And what should we do with the logs and outputs of completed tasks?
It is a very complex issue.

If a task is "heavy" enough, we already prefer to set the frames per task parameter to 1.

It would be much simpler to divide frames per task (fpt) in a non-linear way.
For example, the first half of the job can have 100 fpt, the next block 10, and the last just 1.
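A rough sketch of how such a tapered split could be computed up front, in plain Python. The function tapered_tasks and the (fraction, fpt) tier format are assumptions made for this example, not anything that exists in Afanasy, and the 50% / 40% / 10% split is only one possible reading of the example above.

```python
def tapered_tasks(first, last, tiers):
    """Cut the inclusive range [first, last] into tasks whose size
    shrinks toward the end of the job.

    `tiers` is a list of (fraction_of_job, frames_per_task) pairs.
    Hypothetical helper for illustration; not an Afanasy API.
    """
    total = last - first + 1
    tasks = []
    start = first
    for index, (fraction, fpt) in enumerate(tiers):
        if index == len(tiers) - 1:
            # The last tier absorbs any rounding remainder.
            tier_end = last
        else:
            tier_end = min(last, start + round(total * fraction) - 1)
        while start <= tier_end:
            end = min(start + fpt - 1, tier_end)
            tasks.append((start, end))
            start = end + 1
    return tasks


# 1000 frames: first half at 100 fpt, next 40% at 10 fpt, last 10% at 1 fpt.
tasks = tapered_tasks(1, 1000, [(0.5, 100), (0.4, 10), (0.1, 1)])
print(len(tasks), tasks[0], tasks[5], tasks[-1])
```

Since the task list is fixed before submission, the number of tasks never has to change while the job is running.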

Re: Auto-maximize render capacity

Posted: Sun Jul 23, 2017 10:35 pm
by avclubvids
I can understand that it might be difficult to implement. Perhaps there is a temporary solution that does not require as much work. In the situation described, where there are idle machines but tasks still rendering, you could have an optional system wherein:

The remaining tasks get re-submitted as a new render job, with fewer frames per task and with any already completed frames skipped.
If the machines go idle again, perhaps after a specified timeout, the process repeats until all of the frames are rendered.

Obviously, this is not the best way to do this, but with a "skip existing" option it would work well, and logs would be preserved, etc.
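One pass of that re-submit step might look roughly like the sketch below. It only works out which frame ranges the new job would contain; the name remaining_tasks and the `done` set are hypothetical, and how you detect finished frames (for example, checking that output files exist) and submit the new job is left out.

```python
def remaining_tasks(first, last, done, fpt):
    """Return (start, end) tasks covering every frame in [first, last]
    that is not in `done`, with at most `fpt` frames per task.

    Hypothetical helper for illustration; `done` is assumed to be a set
    of frame numbers that are already rendered.
    """
    tasks = []
    run_start = None
    # Iterate one frame past the end so the final run gets flushed.
    for frame in range(first, last + 2):
        missing = frame <= last and frame not in done
        if missing and run_start is None:
            run_start = frame
        elif not missing and run_start is not None:
            # Close the run [run_start, frame - 1] and cut it into chunks.
            start = run_start
            while start <= frame - 1:
                end = min(start + fpt - 1, frame - 1)
                tasks.append((start, end))
                start = end + 1
            run_start = None
    return tasks


# Frames 1-8 and 12 are already rendered; re-chunk the rest at 5 fpt.
done = set(range(1, 9)) | {12}
print(remaining_tasks(1, 20, done, 5))
# [(9, 11), (13, 17), (18, 20)]
```

Repeating this with a smaller fpt each time machines go idle gives the loop described above.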

Re: Auto-maximize render capacity

Posted: Sun Nov 19, 2017 10:27 pm
by Strob
It would be great if you could do what you said there, nice idea: "It will be much more simple to divide frames per task (fpt) not in linear way.
For example first half task can have 100 fpt, next block 10, and last just 1."