Parser Error not working as expected

lukasm
Posts: 4
Joined: Thu Mar 16, 2017 1:37 pm


Post by lukasm »

Hello,
I'm having an issue with the error raised by a Python parser.
In my Nuke parser class I set self.error=True. As I understand it, afanasy should then report a Parser Bad Result (PBR) error and restart the task. Instead, the task stays in the RUN state while the render process is killed in the background. The afrender output shows: "Error: Parser bad result. Stopping task."
The task is never restarted (the state stays RUN), and after some time afserver hits the af_task_update_timeout threshold. Afrender then tries to stop the task, but fails with the message
AFERROR: RenderHost::stopTask: 0 tasks, no such task
TP: j6 b0 t4
After some more waiting, the server hits af_task_stop_timeout.

This seems like strange behaviour to me. Can you reproduce it? What am I missing here? How do I properly raise an error from the Python parser?

CGRU version 2.2.0 / Python version 2.6.6 / GCC version 4.4.7
timurhai
Site Admin
Posts: 911
Joined: Sun Jan 15, 2017 8:40 pm
Location: Russia, Korolev

Re: Parser Error not working as expected

Post by timurhai »

Hi.

If you hit af_task_update_timeout and af_task_stop_timeout, it means that afrender stopped sending any info about the task.
So after the parser set "self.error=True", afrender killed the task and simply forgot about it: it did not tell afserver that this was a PBR, so the server still thinks the task is RUN.
If so, it is a bug.
But we need to catch it.
I always test on a dummy task:
https://github.com/CGRU/cgru/tree/maste ... %20scripts
There is also a job.py that sends such a task.py.
It can simulate various kinds of output for parsers and generate images for thumbnails.
I just tested PBR there and it works.

The best way to catch a bug is to write the simplest possible task.py and a job.py that sends it.
If I can reproduce it, I can debug it.
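
For illustration, here is a minimal sketch of what such a dummy task script could look like. This is not the task.py from the CGRU examples: the function name run_dummy_task, its parameters, and the FRAME:/ERROR: line formats are all assumptions made up here, chosen only so that a parser has something predictable to react to.

```python
import sys
import time


def run_dummy_task(frames, delay=0.0, fail_at=None, out=sys.stdout):
    """Write one line per frame; optionally write an error line and stop.

    The 'FRAME:' and 'ERROR:' formats are invented for this sketch.
    Returns the exit code the script would finish with.
    """
    for frame in range(1, frames + 1):
        if fail_at is not None and frame == fail_at:
            out.write('ERROR: simulated failure at frame %d\n' % frame)
            out.flush()
            return 1
        out.write('FRAME: %d of %d\n' % (frame, frames))
        out.flush()  # flush so a parser sees the output as it is produced
        time.sleep(delay)
    return 0
```

Calling run_dummy_task(5, delay=1.0, fail_at=3) from a small wrapper script would give a task that runs for a couple of seconds and then prints an error line, which may be enough to exercise a parser without involving Nuke or Mantra.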
Timur Hairulin
CGRU 3.3.1, Ubuntu 20.04, 22.04, MS Windows 10 (clients only).
lukasm
Posts: 4
Joined: Thu Mar 16, 2017 1:37 pm

Post by lukasm »

Hi!
I tested it with a simple job sent by job.py. With the generic service and parser, the job finished normally. Then I added a single line to the parser's "do" function: self.error = True. The job still behaves normally and shows its error.
Then I switched back to Nuke, submitted a simple Nuke job and added the same line to the Nuke parser. After the job starts, the unexpected behavior described above occurs again. The same happens with Mantra, by the way.
So I think the problem lies not in the parser code but in some difference between a "simple" Python job and Nuke/Mantra. Any chance you can reproduce this (e.g. with a Nuke job)?
lukasm
Posts: 4
Joined: Thu Mar 16, 2017 1:37 pm

Post by lukasm »

Hi!
I managed to create a really simple test/debug scenario. This is my job submission:

Code:

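import af

# Test job: ten tasks that each just sleep for five minutes
job = af.Job('test')

mycmd = 'sleep 300'

# One block using the 'generic' service and the custom parser
block = af.Block('block', 'generic')
job.blocks.append(block)

block.setParser('myparser')
block.setCommand(mycmd, False)
block.setTasksName('task @#@')

# Each task carries its own command
for t in range(10):
    task = af.Task('#' + str(t))
    task.setCommand(mycmd)
    block.tasks.append(task)

# Empty string: do not restrict the job to a particular OS
job.setNeedOS('')

job.send()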
And this is myparser.py:

Code:

from parsers import parser

class myparser(parser.parser):
    def __init__(self):
        parser.parser.__init__(self)

    def do(self, data, mode):
        # Unconditionally flag an error to reproduce the issue
        self.error = True
When I submit this to the farm, the issue happens as described.
timurhai
Site Admin
Posts: 911
Joined: Sun Jan 15, 2017 8:40 pm
Location: Russia, Korolev

Post by timurhai »

Hi.
I see now. I reproduced the same bug with Nuke.
I will try to solve this problem.
lithorus
Posts: 28
Joined: Wed Jan 25, 2017 4:14 pm

Post by lithorus »

Until it's fixed, and if you don't need to kill the process the way

Code:

self.error=True
does, you might want to look at

Code:

self.badresult=True
which doesn't seem to kill it but will restart the task (after it's done).
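
To make the difference concrete, here is a small sketch that can run outside CGRU. ParserStandIn is a hypothetical stand-in for the real parsers.parser.parser base class and only carries the two flags discussed in this thread. The FATAL: and WARNING: bad frame patterns are invented for the example, and data is assumed to be a plain string.

```python
class ParserStandIn(object):
    """Hypothetical stand-in for CGRU's parsers.parser.parser base class."""

    def __init__(self):
        self.error = False      # per this thread: the running process gets killed
        self.badresult = False  # per this thread: task is restarted after it is done


class MyParser(ParserStandIn):
    def do(self, data, mode):
        # Same signature as in myparser.py above; 'mode' is unused in this sketch.
        for line in data.splitlines():
            if line.startswith('FATAL:'):
                # Unrecoverable: no point in letting the process continue
                self.error = True
            elif line.startswith('WARNING: bad frame'):
                # Let the process finish, but ask for the task to be redone
                self.badresult = True
```

In a real parser the class would derive from parser.parser as in the post above; the stand-in exists only so the logic can be tried without a farm.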
timurhai
Site Admin
Posts: 911
Joined: Sun Jan 15, 2017 8:40 pm
Location: Russia, Korolev

Post by timurhai »

Hi.

Bug reason:
When afrender receives an af::TaskExec to start from afserver, it checks whether such a task (jobid, blocknum, tasknum) already exists.
On a parser bad result, afrender asks the task to stop and sends ParserError to afserver.
Afserver asks afrender to close the ParserError task.
Afserver sends a new task to afrender.
The ParserError task process can still be running when the same task arrives from afserver, so the same task can already exist.
Afrender simply ignored such duplicate tasks, without any logging.
Afserver thinks that afrender took and started the new task.
This is where the timeouts happen.
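
The sequence above can be modelled in a few lines of Python. This is only a toy model of the described behaviour, not the afrender C++ code; the class RenderBeforeFix and its method names are invented for the sketch.

```python
class RenderBeforeFix(object):
    """Toy model of afrender's task bookkeeping before the fix."""

    def __init__(self):
        # Keys are (job_id, block_num, task_num) of processes still alive
        self.tasks = set()

    def start_task(self, key):
        if key in self.tasks:
            # Same task already exists: silently ignored, nothing is logged
            return False
        self.tasks.add(key)
        return True

    def process_exited(self, key):
        self.tasks.discard(key)


render = RenderBeforeFix()
key = (6, 0, 4)  # j6 b0 t4, as in the log from the first post

first = render.start_task(key)      # original run, parser then reports an error
# The old process is still terminating when the server resends the task:
restarted = render.start_task(key)  # ignored, yet the server believes it is running
```

Here restarted ends up False: the server has handed out the task, the render has dropped it, and nothing ever reports back.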

Bug fix:
A new TaskProcess state: CLOSED.
When afserver asks afrender to close a task, the task gets the CLOSED state.
It can live in this state for some time, for example while waiting for the process to terminate after a ParserError.
With the new CLOSED state, afrender checks not just whether the same task exists, but whether the same and NOT CLOSED task exists.

https://github.com/CGRU/cgru/commit/8b0 ... 94bd9948e0

If for some reason the same task exists and is not closed, afrender will still ignore the new one, but now it also prints a warning to the log.
This can help to debug such ignored tasks in the future.
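
As a rough illustration of the fixed check, here is a toy Python model. It is not taken from the commit: RenderAfterFix, its methods, and the warning text are invented names, and only the idea (ignore a new task only if the same task exists and is not CLOSED, and log a warning when ignoring) comes from the description above.

```python
import logging

log = logging.getLogger('afrender_model')

RUNNING, CLOSED = 'RUNNING', 'CLOSED'


class RenderAfterFix(object):
    """Toy model of afrender's task bookkeeping with a CLOSED state."""

    def __init__(self):
        # Entries are [key, state]; a CLOSED entry and a fresh RUNNING
        # entry for the same key may coexist for a while.
        self.tasks = []

    def start_task(self, key):
        if any(k == key and state != CLOSED for k, state in self.tasks):
            log.warning('Ignoring task %s: same task exists and is not closed', key)
            return False
        self.tasks.append([key, RUNNING])
        return True

    def close_task(self, key):
        # Server asked to close the task; the process may still be terminating
        for entry in self.tasks:
            if entry[0] == key:
                entry[1] = CLOSED

    def process_exited(self, key):
        # Drop closed entries for this key once their process is gone
        self.tasks = [e for e in self.tasks
                      if not (e[0] == key and e[1] == CLOSED)]
```

With this model, calling close_task(key) and then start_task(key) succeeds even though the old entry has not been removed yet, which is the situation that used to end in timeouts.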
selsner
Posts: 47
Joined: Wed Jan 25, 2017 11:20 am

Post by selsner »

Could I update our production server to the current master to get the fix? Is the "pools" feature working/doing something? Any other major changes since 2.2.1 that I need to check?
CGRU 2.3.1 - CentOS 7.7

Sebastian Elsner - Pipeline Technical Director - RISE
www.risefx.com
timurhai
Site Admin
Posts: 911
Joined: Sun Jan 15, 2017 8:40 pm
Location: Russia, Korolev

Post by timurhai »

Yes, you can.
Pools are not working, but they will not get in the way.
The biggest change is the thread stack limit.
selsner
Posts: 47
Joined: Wed Jan 25, 2017 11:20 am

Post by selsner »

Ok.

btw, how did you determine this is a good value for the stack?