Auto-maximize render capacity
Posted: Fri Jul 21, 2017 9:59 pm
There's a trick that we've used in other render managers like Deadline that would be very useful in CGRU:
As a job with multiple frames per task nears completion, render nodes often begin to go idle while the final tasks are finished by only a few machines. In these cases we have had great success with a script that resets the task chunk size so that the remaining frames are divided dynamically among the available nodes. In practice, a final task of 100 frames rendering on 1 machine can become 5 tasks of 20 frames rendering on 5 machines.
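To make the chunk-size idea concrete, here is a minimal sketch of the splitting step only. It is plain Python and does not use any Deadline or CGRU API; the function name `split_frames` and its arguments are made up for illustration, and it assumes the remaining frames form one contiguous, inclusive range.

```python
def split_frames(first, last, nodes):
    """Split the inclusive frame range first..last into at most `nodes`
    contiguous chunks of near-equal size (hypothetical helper)."""
    total = last - first + 1
    if total <= 0 or nodes <= 0:
        return []
    # Never create more chunks than there are frames.
    parts = min(nodes, total)
    base, extra = divmod(total, parts)
    chunks = []
    start = first
    for i in range(parts):
        # The first `extra` chunks take one additional frame.
        size = base + (1 if i < extra else 0)
        chunks.append((start, start + size - 1))
        start += size
    return chunks


if __name__ == "__main__":
    # A final task of 100 frames, 5 nodes available.
    print(split_frames(1, 100, 5))
```

With 100 frames and 5 nodes this gives five 20-frame chunks, matching the example above. A real script would still have to feed these ranges back to the render manager as new tasks.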
The end result is greater utilization of resources and renders that make it across the finish line faster.
An ideal version of this for CGRU would let the server re-task a running node: change the list of frames it is currently assigned and re-allocate the rest among other nodes. That way no frames would have to be re-rendered by re-queuing a task; the server could just dynamically re-allocate frames among all of the available render nodes.
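As a rough illustration of what such re-tasking might compute, the sketch below keeps the frames a node has already finished or is rendering right now, and shares only the untouched frames with idle nodes. This is not existing CGRU behavior or API; `retask`, `frames`, `current_index`, and `idle_nodes` are hypothetical names, and it assumes the server knows which frame the node is currently on.

```python
def retask(frames, current_index, idle_nodes):
    """Return (kept, handed_off) for a running task (hypothetical helper).

    frames        -- frame numbers assigned to the running node, in order
    current_index -- index into `frames` of the frame being rendered now
    idle_nodes    -- number of idle nodes that could take over frames
    """
    started = frames[:current_index + 1]   # finished or in progress
    pending = frames[current_index + 1:]   # not started yet
    if not pending or idle_nodes <= 0:
        return list(frames), []
    # The running node keeps a share too, so count it as a worker.
    workers = min(idle_nodes + 1, len(pending))
    base, extra = divmod(len(pending), workers)
    shares, pos = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        shares.append(pending[pos:pos + size])
        pos += size
    # First share stays with the running node; nothing gets re-rendered.
    return started + shares[0], shares[1:]


if __name__ == "__main__":
    kept, handed_off = retask(list(range(1, 101)), 19, 4)
    print(kept[0], kept[-1], [(s[0], s[-1]) for s in handed_off])
```

Because only unstarted frames move, the running node never has to be stopped or re-queued; it just ends earlier than originally planned.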
This should probably be a per-job option, as some software will have issues running this way, but it's a great way to get a render out in a hurry.