Parser Error not working as expected
Hello,
I'm having an issue with the error thrown by a Python parser:
In my Nuke parser class I'm setting self.error=True. As I understand it, Afanasy should then get a Parser Bad Result (PBR) error and restart the task. In my case the task stays in the RUN state, while the render process is killed in the background. The afrender output gives me: "Error: Parser bad result. Stopping task."
But instead of the task restarting, nothing happens (the state stays RUN), and after some time afserver hits the af_task_update_timeout threshold. Afrender then tries to stop the task, but fails with the message
AFERROR: RenderHost::stopTask: 0 tasks, no such task
TP: j6 b0 t4
Then, after some more waiting, the server hits af_task_stop_timeout.
This seems like strange behaviour. Can you reproduce it? What am I missing here? How do I properly raise an error from the Python parser?
CGRU version 2.2.0 / Python version 2.6.6 / GCC version 4.4.7
Re: Parser Error not working as expected
Hi.
If you hit af_task_update_timeout and af_task_stop_timeout, it means that afrender stopped sending any info about the task.
So after the parser set "self.error=True", afrender killed the task and simply forgot about it: it did not tell afserver that the task is PBR, and the server still thinks it is RUN.
If so, it is a bug.
But we need to catch it.
I always test on a dummy task:
https://github.com/CGRU/cgru/tree/maste ... %20scripts
There is also a job.py that sends such a task.py.
It can simulate various output for parsers and generate images for thumbnails.
I just tested PBR with it, and it works.
The best way to catch a bug is to write the simplest task.py and a job.py that sends it.
If I can reproduce it, I can debug it.
Timur Hairulin
CGRU 3.3.1, Ubuntu 20.04, 22.04, MS Windows 10 (clients only).
Re: Parser Error not working as expected
Hi!
I tested it with a simple job sent by job.py. With the generic service and parser, the job finished normally. Then I added just one line of code to the "do" function of the parser: self.error = True. The job still behaves normally and shows its error.
Then I switched back to Nuke, submitted a simple Nuke job and added the same line to the Nuke parser. After starting, the same unexpected behavior occurred as described before. By the way, it's the same with Mantra.
So I think the problem doesn't lie in the parser code, but in some difference between a "simple" Python job and Nuke/Mantra. Any chance you can reproduce this behavior (e.g. with a Nuke job)?
Re: Parser Error not working as expected
Hi!
I managed to create a really simple test/debug scenario. This is my job submission:

    import af

    job = af.Job('test')

    # Each task just sleeps, so there is a long-running process to parse.
    mycmd = 'sleep 300'

    block = af.Block('block', 'generic')
    job.blocks.append(block)
    block.setParser('myparser')  # use the custom parser shown below
    block.setCommand(mycmd, False)
    block.setTasksName('task @#@')

    # Add ten identical tasks to the block.
    for t in range(10):
        task = af.Task('#' + str(t))
        task.setCommand(mycmd)
        block.tasks.append(task)

    job.setNeedOS('')
    job.send()

And this is myparser.py:

    from parsers import parser


    class myparser(parser.parser):
        def __init__(self):
            parser.parser.__init__(self)

        def do(self, data, mode):
            # Flag an error on every call to trigger the problem.
            self.error = True

When I submit this to the farm, the issue happens as described.
Re: Parser Error not working as expected
Hi.
I see now. I reproduced the same bug with Nuke.
I will try to solve this problem.
Timur Hairulin
Re: Parser Error not working as expected
Until it's fixed, and if you don't need to kill the process like self.error=True does, you might want to look at self.badresult=True, which doesn't seem to kill it but will restart the task (after it's done).
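To make the difference concrete, here is a small self-contained sketch. ParserBase is a hypothetical stand-in for parsers.parser.parser so that the snippet runs without CGRU installed; only the error and badresult flags and the do(self, data, mode) signature come from this thread. The 'ERROR' marker and the 'RUN' mode value are assumptions for illustration.

```python
class ParserBase(object):
    """Hypothetical stand-in for parsers.parser.parser (not the real CGRU class)."""

    def __init__(self):
        self.error = False      # per this thread: the process gets killed
        self.badresult = False  # per this thread: task restarts after it finishes


class MyParser(ParserBase):
    """Flags a bad result instead of an error when the output looks wrong."""

    def do(self, data, mode):
        # 'ERROR' is an assumed marker; use whatever your renderer prints.
        if 'ERROR' in data:
            self.badresult = True


p = MyParser()
p.do('frame 1 done', 'RUN')
print(p.badresult)  # False
p.do('ERROR: missing texture', 'RUN')
print(p.badresult)  # True
```

In a real parser you would derive from parser.parser as in the myparser.py posted earlier in this thread, and keep only the do() body.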
Re: Parser Error not working as expected
Hi.
Bug reason:
When afrender receives an af::TaskExec to start from afserver, it checks whether such a task (jobid, blocknum, tasknum) already exists.
On a parser bad result, afrender asks the task to stop and sends ParserError to afserver.
Afserver asks afrender to close the ParserError task.
Afserver sends a new task to afrender.
The ParserError task process can still be running when the same task arrives from afserver, so the same task can already exist.
Afrender just ignored such duplicate tasks, without even logging it.
Afserver thinks that afrender took and started the new task.
This is where the timeouts happen.
Bug fix:
A new TaskProcess state: CLOSED.
When afserver asks afrender to close a task, the task gets the CLOSED state.
It can live in this state for some time, for example while waiting for the process to terminate after a ParserError.
With the new CLOSED state, afrender checks not just whether the same task exists, but whether the same and NOT CLOSED task exists.
https://github.com/CGRU/cgru/commit/8b0 ... 94bd9948e0
If for some reason the same, not closed task exists, afrender will not silently ignore it, but ignore it and print a warning log.
This can help to debug such ignores in the future.
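The duplicate check described above can be modelled in a few lines of Python. This is only a toy simulation, not the afrender code (which is C++): the Render class, its method names and the log text are hypothetical, and the key (6, 0, 4) assumes that "TP: j6 b0 t4" means job 6, block 0, task 4.

```python
RUNNING = 'RUNNING'
CLOSED = 'CLOSED'  # server asked to close; the process may still be terminating


class Render(object):
    """Toy model of the render-side task list (names are hypothetical)."""

    def __init__(self, fixed):
        self.fixed = fixed  # False: old behaviour, True: with the CLOSED check
        self.tasks = []     # list of [key, state]; key = (jobid, blocknum, tasknum)
        self.log = []

    def close_task(self, key):
        for task in self.tasks:
            if task[0] == key:
                task[1] = CLOSED

    def start_task(self, key):
        """Return True if the task was accepted and started."""
        for existing_key, state in self.tasks:
            if existing_key != key:
                continue
            if self.fixed and state == CLOSED:
                continue  # a closed duplicate no longer blocks the restart
            if self.fixed:
                self.log.append('WARNING: task %s already exists, ignored' % (key,))
            return False  # old behaviour: ignored without any log
        self.tasks.append([key, RUNNING])
        return True


key = (6, 0, 4)
for fixed in (False, True):
    render = Render(fixed)
    render.start_task(key)
    render.close_task(key)  # ParserError: the process is still terminating
    print(fixed, render.start_task(key))
# prints "False False", then "True True"
```

In the old behaviour the restarted task is dropped silently while the server believes it is running, which matches the timeouts reported at the start of this thread.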
Timur Hairulin
Re: Parser Error not working as expected
Could I update our production server to the current master to get the fix? Is the "pools" feature working/doing something? Any other major changes since 2.2.1, that I need to check?
Re: Parser Error not working as expected
Yes you can.
Pools are not working, but they will not get in the way.
The biggest change is the thread stack limit.
Timur Hairulin
Re: Parser Error not working as expected
Ok.
btw, how did you determine this is a good value for the stack?