Agents start working with delay

victor
Posts: 22
Joined: Mon May 15, 2017 9:53 am

Agents start working with delay

Post by victor »

Hi,
We tried to use 50 agents for very fast After Effects rendering.
Unfortunately, afwatch shows that the agents start working after a delay, so the total rendering time does not improve compared to 16 agents.
In afwatch we clearly see that most agents do not start until at least 40 seconds after the starting point.

Any thoughts?

Thx
timurhai
Site Admin
Posts: 911
Joined: Sun Jan 15, 2017 8:40 pm
Location: Russia, Korolev

Re: Agents start working with delay

Post by timurhai »

Hi.
There can be many reasons for various delays.
What do you mean by "delay"?
- You start afrender, but it appears in afwatch with a delay?
- You have sent a job, but its tasks are dispatched with a delay?
- The task started, but the AE process appears with a delay?
- The AE process appeared, but it starts producing results with a delay?
- and so on...

Also, please specify your CGRU version and the OS of afserver and afrender.
Timur Hairulin
CGRU 3.3.1, Ubuntu 20.04, 22.04, MS Windows 10 (clients only).
victor
Posts: 22
Joined: Mon May 15, 2017 9:53 am

Re: Agents start working with delay

Post by victor »

CGRU version: 2.2.1

The flow is as follows:
1. Setup
1.1 The master is up (afserver)
1.2 The master UI is up (afwatch)
1.3 50 agents are up (afrender): we run 50 machines, each running afrender as a service
1.4 We see in the UI that all 50 agents are connected to the master
2. We execute afjob.py with parameters (to trigger the distributed rendering process)
3. We see a new job in [Jobs] in afwatch (as expected)
4. We double-click the job and see all its tasks

At this point we expect all tasks to start very quickly, since we have 50 free agents (afrenders), and the rendering to take 1 min in total.
Instead, some tasks start immediately, while most of the tasks start with a very big delay (more than 30 seconds).
Some tasks start after 1 min.

The overall result: total rendering time is 2 min.

NOTE: When using 16 agents (afrenders) we get the same result: 2 min.

Conclusion: we cannot make our rendering faster than 2 min, even when adding more agent machines.
timurhai
Site Admin
Posts: 911
Joined: Sun Jan 15, 2017 8:40 pm
Location: Russia, Korolev

Re: Agents start working with delay

Post by timurhai »

Hi.
So you send a job and double-click it in AfWatch to see all its tasks:
1. Some tasks are in the "READY" state, while there are free renders that are ready to take tasks?
2. Tasks are in the "RUN" state, but progress starts to grow with some delay?
So I want to find out whether there are ready tasks and ready renders at the same time.
Can you also look at the AfWatch "Renders" tab to see what is going on there?
victor
Posts: 22
Joined: Mon May 15, 2017 9:53 am

Re: Agents start working with delay

Post by victor »

1. I send a job and double-click it in AfWatch to see all its tasks:
Tasks are in the "RUN" state, but progress starts to grow with a delay (sometimes more than 30 seconds).
2. In the AfWatch "Renders" tab I see that all 50 renders are connected.
timurhai
Site Admin
Posts: 911
Joined: Sun Jan 15, 2017 8:40 pm
Location: Russia, Korolev

Re: Agents start working with delay

Post by timurhai »

It can be a network bandwidth problem.
Your composition has to download its sources and then upload the result.
Try to measure the traffic from each machine and the maximum network speed.
This can be your bottleneck for render farm scaling.
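
One rough way to do such a measurement is to time how fast a render machine can read a file from the shared folder and compare that with a local disk. The following is only a minimal sketch, not part of CGRU: the helper names measure_read_throughput and make_test_file are made up for illustration, and the shared folder path is whatever your farm uses.

```python
import os
import tempfile
import time


def measure_read_throughput(path, chunk_size=1024 * 1024):
    """Read the file at `path` once and return the throughput in MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length timing interval on very small files.
    elapsed = max(elapsed, 1e-9)
    return total_bytes / (1024 * 1024) / elapsed


def make_test_file(directory, size_mb=8):
    """Create a throwaway file of `size_mb` megabytes inside `directory`."""
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        for _ in range(size_mb):
            f.write(os.urandom(1024 * 1024))
    return path
```

To compare, call make_test_file once on a local directory and once on the shared folder mount, then run measure_read_throughput on each file. Running it on several render machines at the same time shows how the shared bandwidth is divided between them. Note that a file that was just written may be served from the OS cache, so a file written from another machine gives a more realistic number.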
victor
Posts: 22
Joined: Mon May 15, 2017 9:53 am

Re: Agents start working with delay

Post by victor »

OK, I will look into this and get back to you.
Thx
victor
Posts: 22
Joined: Mon May 15, 2017 9:53 am

Re: Agents start working with delay

Post by victor »

You were right!
The bottleneck is the network or the shared folder we use.

I tested the same case with all input and output resources stored locally (the render machines use only local resources) and it runs much faster.
The tasks also start immediately.
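
Why 16 and 50 agents gave the same total time can be illustrated with a simple model: compute is split across the agents, but all of them pull data through one shared link, so the transfer time does not shrink when agents are added. This is only a sketch; the function name estimated_job_time and all figures (50 tasks, 20 s of compute and 240 MB of data per task, 100 MB/s of shared bandwidth) are invented for illustration and were not measured on this farm.

```python
import math


def estimated_job_time(n_agents, n_tasks, compute_s_per_task,
                       data_mb_per_task, shared_mb_per_s):
    """Rough lower bound on the job wall time in seconds.

    Compute runs in parallel across agents, but every agent shares one
    link or file server, so the total transfer is limited by its bandwidth.
    """
    # Tasks are processed in "waves" of at most n_agents at a time.
    waves = math.ceil(n_tasks / n_agents)
    compute_time = waves * compute_s_per_task
    # All task data must pass through the shared link.
    transfer_time = n_tasks * data_mb_per_task / shared_mb_per_s
    return max(compute_time, transfer_time)


# Hypothetical figures, chosen only for illustration.
print(estimated_job_time(16, 50, 20, 240, 100))    # 120.0
print(estimated_job_time(50, 50, 20, 240, 100))    # 120.0
# Local resources modelled as a very fast "link".
print(estimated_job_time(50, 50, 20, 240, 10000))  # 20
```

With these assumed numbers the shared link alone needs 120 seconds, so both 16 and 50 agents finish in about 2 minutes, while with local resources the 50-agent run is limited only by compute.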

Thank you very much.

Afanasy's victory is inevitable :)