AFrender quitting.

Posted: Mon Feb 27, 2017 7:04 pm
by seven11
Hi Timur,
We upgraded to 2.2.1 and noticed that AFrender sometimes quits on our remote render machines, which are in another office.
To give you an idea of the topology: we have a main office and a remote office connected via a tunnel through the internet. The tunnel can
and does go up and down at any time due to outages. Most of the time, the AFrender in the remote office reconnects to the AFserver in the main office.
But sometimes the AFrender just quits. Here's the output from the AFrender log:

A good reconnect looks like this:

AFERROR: msgsendtoaddress: connect failure for msgType 'TRenderUpdate':
10.0.0.26:51000: Operation now in progress

Thu 23 Feb 05:00.44: INFO Connection lost count = 1 of 3
AFERROR: msgsendtoaddress: connect failure for msgType 'TRenderUpdate':
10.0.0.26:51000: Operation now in progress

Thu 23 Feb 05:00.57: INFO Connection lost count = 2 of 3
AFERROR: msgsendtoaddress: connect failure for msgType 'TRenderUpdate':
10.0.0.26:51000: No route to host

Thu 23 Feb 05:01.04: INFO Connection lost count = 3 of 3
AFERROR: msgsendtoaddress: connect failure for msgType 'TRenderUpdate':
10.0.0.26:51000: No route to host

Thu 23 Feb 05:01.07: INFO Connection lost count = 4 of 3
Thu 23 Feb 05:01.07: WARNING Render connection lost, trying to reconnect...
Thu 23 Feb 05:21.37: INFO Reconnected to the server
Thu 23 Feb 05:21.37: INFO Render registered.

####################
A bad reconnect looks like this:
AFERROR: msgsendtoaddress: connect failure for msgType 'TRenderUpdate':
10.0.0.26:51000: Operation now in progress

Sat 25 Feb 23:17.58: INFO Connection lost count = 1 of 3
AFERROR: msgsendtoaddress: connect failure for msgType 'TRenderUpdate':
10.0.0.26:51000: Operation now in progress

Sat 25 Feb 23:18.11: INFO Connection lost count = 2 of 3
AFERROR: msgsendtoaddress: connect failure for msgType 'TRenderUpdate':
10.0.0.26:51000: Operation now in progress

Sat 25 Feb 23:18.24: INFO Connection lost count = 3 of 3
AFERROR: msgsendtoaddress: connect failure for msgType 'TRenderUpdate':
10.0.0.26:51000: No route to host

Sat 25 Feb 23:18.28: INFO Connection lost count = 4 of 3
Sat 25 Feb 23:18.28: WARNING Render connection lost, trying to reconnect...
Sat 25 Feb 23:18.37: INFO Reconnected to the server
Sat 25 Feb 23:18.37: ERROR Render with this hostname 'efile01' already registered.
Sat 25 Feb 23:18.37: INFO Exiting render.

Why is it saying "Render with this hostname 'efile01' already registered."?

Thanks,
Scott

Re: AFrender quitting.

Posted: Mon Feb 27, 2017 9:21 pm
by timurhai
Hi.

Maybe afserver "thinks" that the render is still online.
The server treats an afrender client as offline after 60 seconds by default, and this can be adjusted in the config:
"af_render_zombietime":60
If a render tries to connect (or reconnect) and the server already has an online render with the same name, the server sends the render a signal to exit.
Try setting "af_render_zombietime":10 (seconds) and "af_render_connectretries":9, i.e. the same value or a little less.
Maybe that will work better in your situation.

Re: AFrender quitting.

Posted: Mon Feb 27, 2017 9:34 pm
by timurhai
Sorry.
Try setting "af_render_zombietime":10 (seconds) and "af_render_connectretries":11, i.e. the same value or a little more.
This way the server will treat the render as offline before the render sends its register message.
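
To make the timing argument concrete, here is a rough Python sketch that checks the two logs from the first post against the zombie timeout. It is only an illustration, not Afanasy code: the helper names are made up, the log time format is assumed to be HH:MM.SS, and the outage is measured from the first logged failure, so the real gap since the last successful update is somewhat longer.

```python
def to_seconds(stamp: str) -> int:
    """Convert a log time such as '23:18.37' (assumed HH:MM.SS) to seconds since midnight."""
    hours_minutes, seconds = stamp.split(".")
    hours, minutes = hours_minutes.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def server_still_sees_render(first_failure: str, reconnect: str, zombie_time: int) -> bool:
    """Return True if the reconnect arrives before the zombie timeout has elapsed.

    Hypothetical helper: in that case the server would still list the old
    render as online and reject the new registration.
    """
    outage = to_seconds(reconnect) - to_seconds(first_failure)
    return outage < zombie_time


# Timestamps copied from the two logs in the first post.
good = server_still_sees_render("05:00.44", "05:21.37", zombie_time=60)
bad = server_still_sees_render("23:17.58", "23:18.37", zombie_time=60)
bad_tuned = server_still_sees_render("23:17.58", "23:18.37", zombie_time=10)

print(good, bad, bad_tuned)  # False True False
```

In the good log the outage lasted about 21 minutes, far longer than 60 seconds, so the old entry was already gone. In the bad log the render came back after roughly 39 seconds, inside the default 60 second window, which matches the "already registered" error. With a 10 second zombie time the same reconnect would fall outside the window.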

Re: AFrender quitting.

Posted: Mon Feb 27, 2017 10:00 pm
by seven11
Just changed the two parameters. I'll see how they go.
Thanks Timur,
Scott

Re: AFrender quitting.

Posted: Mon Mar 27, 2017 12:24 pm
by selsner
I am seeing the exact same problem right now with one of our remote offices.

Re: AFrender quitting.

Posted: Mon Mar 27, 2017 1:11 pm
by timurhai
Have you tried:
"Try to set "af_render_zombietime":10 seconds and "af_render_connectretries":11 to the same value or a little more"?

Re: AFrender quitting.

Posted: Mon Mar 27, 2017 1:36 pm
by selsner
Will try.