
Need a more granular way to define capacity?

Posted: Fri Jul 21, 2017 11:00 pm
by avclubvids
Every render has a different limiting factor. Sometimes it is CPU, RAM, or VRAM; sometimes the files are so large that network I/O is the bottleneck. With a single value defining what a render node is capable of, it is difficult for artists to work out how much capacity their render will need. Some After Effects renders can run as 2 or 3 instances per machine, while others need 100% of a given system resource. We have seen little to no impact from running a CPU-intensive task locally (like a simulation) while CGRU renders a GPU-intensive task on the same machine at the same time. But with "capacity" defined as a single number, we cannot tell CGRU that this is OK: it has no way of knowing that one task uses 100% GPU and the other 100% CPU. For the same reason, nothing prevents 2 GPU-intensive tasks from landing on the same node, which would cause a slowdown at best and more likely freeze the machine.

It seems like CGRU needs a better way of understanding a render node's real capacity, perhaps separate CPU, GPU, and RAM capacities?

Re: Need a more granular way to define capacity?

Posted: Sun Jul 23, 2017 6:27 pm
by timurhai
I have a plan for such cases.
Add the ability to define any number of custom (string, number) pairs, both for a job (block) and for render(s).
This way we can define that some render has cpu:400, gpu:100, mem:64, and a task defined as cpu:300, gpu:1, mem:1 will mean that essentially only the CPU is used. But for now I have lots of other things to implement.
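
To make the idea concrete, here is a minimal sketch of how such named pairs could be matched. It is not CGRU code: the helper names `fits` and `reserve` and the `gpu_task` values are hypothetical, and only the render and CPU-task numbers come from the example above.

```python
from typing import Dict

Resources = Dict[str, int]  # custom (string, number) pairs, e.g. {"cpu": 400}


def fits(task_needs: Resources, host_free: Resources) -> bool:
    """Return True if the host has enough of every resource the task asks for."""
    return all(host_free.get(name, 0) >= amount
               for name, amount in task_needs.items())


def reserve(task_needs: Resources, host_free: Resources) -> Resources:
    """Return the host's remaining resources after the task is started."""
    return {name: free - task_needs.get(name, 0)
            for name, free in host_free.items()}


# Values taken from the post above.
render = {"cpu": 400, "gpu": 100, "mem": 64}
cpu_task = {"cpu": 300, "gpu": 1, "mem": 1}

# Hypothetical GPU-heavy task, invented for this illustration.
gpu_task = {"cpu": 100, "gpu": 99, "mem": 16}

# Start the GPU task first, then see what else the render can accept.
free_after_gpu = reserve(gpu_task, render)

cpu_task_allowed = fits(cpu_task, free_after_gpu)      # CPU task can share the host
second_gpu_allowed = fits(gpu_task, free_after_gpu)    # a second GPU task cannot
```

With per-resource numbers the CPU-heavy task still fits next to the GPU-heavy one, while a second GPU-heavy task is refused, which is the behaviour asked for in the first post.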

We have tasks that consume different resources too.
For now we solve it this way: set a low capacity value and a low max tasks per host (mph).
So we limit tasks not by capacity but by mph, and 2 different jobs will run in parallel.
(I know that this solution is much worse and less flexible than the one described above.)
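
A rough sketch of this workaround, assuming mph is a per-job limit on how many of that job's tasks may run on one host (that is my reading of the post, not confirmed here). The `Host` class, the job names, and all the numbers are hypothetical and do not come from CGRU.

```python
from collections import Counter


class Host:
    """Toy render host that tracks free capacity and running tasks per job."""

    def __init__(self, capacity: int):
        self.free_capacity = capacity
        self.tasks_per_job = Counter()  # job name -> tasks running on this host

    def try_start(self, job: str, task_capacity: int, max_tasks_per_host: int) -> bool:
        # The capacity check still exists, but the task capacity is set so low
        # that it almost never blocks anything.
        if task_capacity > self.free_capacity:
            return False
        # The real limit: how many tasks of this job may run on this host.
        if self.tasks_per_job[job] >= max_tasks_per_host:
            return False
        self.free_capacity -= task_capacity
        self.tasks_per_job[job] += 1
        return True


host = Host(capacity=1000)
results = [
    host.try_start("sim_job", task_capacity=10, max_tasks_per_host=1),
    host.try_start("sim_job", task_capacity=10, max_tasks_per_host=1),
    host.try_start("gpu_job", task_capacity=10, max_tasks_per_host=1),
]
```

The second `sim_job` task is refused by mph while `gpu_job` runs alongside the first one. Nothing here knows which resource a job actually uses, so two different GPU-heavy jobs could still end up on the same host, which is why this is less flexible than named resources.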