www.BinaryAlchemy.de :: View topic - ExecuteOnce randomly runs multiple times
ExecuteOnce randomly runs multiple times

 

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Thu Jan 09, 2014 12:16 pm    Post subject: ExecuteOnce randomly runs multiple times

Hi, I use some "Execute once" jobs that are submitted via the command line with an XML job description file.

The job calls a .bat file with parameters (it changes file permissions to make the files read-only).

The job is bound to a single user and forced onto a single machine.

The .bat returns a correct exit code of 0.

Randomly, RR seems to resubmit the same task again and again, yet shows no error in the job log section.

Everything looks correct in the job and the logs, but the global job log never displays "Sucessfull 1-1,1".

Sometimes the exact same type of job succeeds and finishes at the first attempt; sometimes it runs 5 times.

Any idea?

I have a debug file that I can upload on request.

Rgds

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Thu Jan 09, 2014 12:54 pm

Yes, please upload the debug file via www.RoyalRender.de/upload_r.php
_________________
Holger Schönberger
Binary Alchemy - digital materialization

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Thu Jan 09, 2014 1:01 pm

Thanks, it's uploaded: 140109_11_debugInfo_CGEV.zip

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Tue Jan 14, 2014 4:08 pm

Hi,
have you had time to check that debug file?

Thanks

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Tue Jan 14, 2014 4:57 pm

Hi

There was neither a job debug file nor a website HTML file in the zip.
So I need to do some tests.

What do you have as the output image in rrControl for the job?

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Tue Jan 14, 2014 5:00 pm

Output is set to execOnce.file (the default for this type of job, I assume).

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Fri Jan 17, 2014 2:37 pm

I tested some jobs, but I was not able to reproduce the issue.
Please update RR to 6.2.31 or 7.0b9+6.2.33

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Fri Jan 17, 2014 6:28 pm

We were trying to think of an explanation for why those jobs run several times even if the first attempt finished with a correct exit code of 0.

I believe that this type of execute job does not check for rendered frames (as no frames are rendered), and I imagine it relies exclusively on the exit code from the logs (am I right?).
These days the storage that hosts the RR folders and also many projects is seriously stressed every day by the render farm and a large number of artists accessing it, so it can sometimes be (really) slow to respond.

Is it possible that the RR server takes so long to access the log that it considers the job hung and launches a new try?

If that is the case, is there a (server) log somewhere that could confirm this behaviour?

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Fri Jan 17, 2014 6:45 pm

No, execute jobs do not rely on any files.

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Fri Jan 17, 2014 6:46 pm

So they should just launch and be considered done once the log is created?

What does RR use to know the task is finished?

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Sat Jan 18, 2014 2:08 pm

The client returns that the job was executed.
Have you updated?

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Mon Jan 20, 2014 12:18 pm

Hi, I'll update as soon as possible, but there are some critical renders these days and I can't take any risks.

I'll update at the first available time window.


Rgds

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Thu Jan 23, 2014 3:13 pm

Hi, I'll probably update this weekend. We are just a few hours from a big delivery; if I touch anything now and it goes wrong, they'll kill me :)

In the meantime I'm still trying to investigate what makes the jobs loop, and I can't find any reason why some do and some don't.

I even tried resubmitting jobs that failed (i.e. ran 5 times and were finally disabled with "Job sent 3 times more than frame exists") using the same "rrsubmitterconsole job.xml" method, and they do work.

While I was checking, I noticed that some jobs (not necessarily the faulty ones) had the same ID:
at the moment I can see 4 different jobs named {74F}. Is that normal, or a sign of something going wrong? (Note that they were created within the same 5-second window.)

BTW, is there a debug option I could enable to see what happens on the server side when it receives the "Job Done" message from the client and decides to requeue the job?

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Thu Jan 23, 2014 5:07 pm

Jobs should never have the same ID.
To submit multiple jobs one after another you have a few choices:
- Use one XML file for all jobs
- Set the preID flag in the XML or on the command line
- Wait 50-100 milliseconds between sending jobs
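
[Editor's sketch] The third option could look roughly like the wrapper below, which submits XML files one at a time and pauses between them. This is only an illustration, not Royal Render's API: `submit_all`, its `runner` and `sleep` parameters, and the 0.1 s default are hypothetical, the executable name and its location depend on the installation, and the command line is simply the submitter followed by the XML path, as described elsewhere in this thread.

```python
import subprocess
import time


def submit_all(xml_files, submitter="rrSubmitterconsole", delay_s=0.1,
               runner=subprocess.run, sleep=time.sleep):
    """Submit each XML file in turn, pausing between submissions.

    The pause keeps two submissions from landing on the same
    millisecond timer tick. `runner` and `sleep` are injectable so the
    loop can be tried without a Royal Render installation.
    """
    commands = []
    for index, xml_path in enumerate(xml_files):
        if index > 0:
            sleep(delay_s)  # 50-100 ms is the range suggested above
        command = [submitter, str(xml_path)]  # assumed: submitter + XML path only
        runner(command, check=True)  # raises if the submitter exits non-zero
        commands.append(command)
    return commands
```

A delay only lowers the chance of a clash; setting a distinct preID per job or putting all jobs into one XML file does not depend on timing at all.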

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Thu Jan 23, 2014 5:10 pm

Thanks, we have an automation that submits 2 jobs in a row; I need to change that.

What are the side effects of having multiple jobs with the same ID?

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Thu Jan 23, 2014 5:18 pm

All communication is done via job ID.
E.g. Job_Finished(ID)
The server searches the queue from start to end, and when it finds the ID, the message is applied once, to that job and only that job.
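
[Editor's sketch] A toy model of that lookup shows why duplicate IDs can leave a job that never appears finished. It is not Royal Render's code: the `Job` class, `job_finished`, the job names and the ID "ABC" are all made up for illustration, and the only behaviour taken from the description above is "scan from the start, stop at the first match".

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    name: str
    finished: bool = False


def job_finished(queue, job_id):
    """Mark the first job whose ID matches as finished.

    Scans from the start of the queue and stops at the first match,
    so a later job with the same ID is never reached.
    """
    for job in queue:
        if job.job_id == job_id:
            job.finished = True
            return job
    return None


# Two different jobs that accidentally share one ID (hypothetical values).
queue = [Job("ABC", "seal_rights_A"), Job("ABC", "seal_rights_B")]

# Both clients report completion with the same ID...
job_finished(queue, "ABC")
job_finished(queue, "ABC")

# ...but only the first queue entry is ever updated.
states = [(job.name, job.finished) for job in queue]
print(states)  # [('seal_rights_A', True), ('seal_rights_B', False)]
```

In this model the second job stays unfinished no matter how often its client reports success, which is at least consistent with a job being sent again and again.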

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Thu Jan 23, 2014 5:22 pm

OK, so if the same ID appears twice, it may corrupt something?

To be sure: is it submitting at the exact same time that leads to the same ID?
What if 2 artists submit a job at the same time from different machines?

Strangely, the jobs that have the same ID don't seem to loop more or less than the others.

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Thu Jan 23, 2014 5:32 pm

The ID is generated from:
- A random number 0-255
- The last two IP address numbers
- The current time in milliseconds (wraps every 49.7 days)

The OS function that returns the milliseconds is sometimes only updated every 7 ms.
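
[Editor's sketch] The numbers above can be checked with a small toy model. The 49.7 days match a 32-bit millisecond counter (2^32 ms). How Royal Render actually combines the parts is not stated in this thread, so `make_toy_id`, its tuple layout, the 32-bit assumption and the IP address below are illustrative assumptions only.

```python
import random

MS_PER_DAY = 24 * 60 * 60 * 1000
WRAP_MS = 2 ** 32  # assumption: the millisecond counter is 32 bits wide


def wrap_period_days():
    """Days until a 32-bit millisecond counter starts over."""
    return WRAP_MS / MS_PER_DAY


def make_toy_id(ip_address, now_ms, tick_ms=7, rng=random):
    """Build a toy ID from the three ingredients listed above.

    Returns (random byte, third IP number, fourth IP number, time).
    The time is rounded down to the timer tick to mimic an OS clock
    that only advances every `tick_ms` milliseconds.
    """
    octets = [int(part) for part in ip_address.split(".")]
    coarse_ms = (now_ms // tick_ms) * tick_ms % WRAP_MS
    return (rng.randrange(256), octets[2], octets[3], coarse_ms)


# Two submissions 6 ms apart from the same (made-up) host see the same
# clock value, so only the random byte can tell them apart.
first = make_toy_id("192.168.4.17", now_ms=994)
second = make_toy_id("192.168.4.17", now_ms=1000)
same_tick = first[1:] == second[1:]
```

In this model, two jobs sent from one machine within the same timer tick collide with a probability of 1 in 256, which is small per pair but adds up for an automation that submits pairs of jobs all day.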

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Thu Jan 23, 2014 5:37 pm

I must be mistaken.

My IDs are 3 letters/numbers between {}.

Isn't that the full ID?

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Thu Jan 23, 2014 5:46 pm

Oh, and I forgot another part: the preID of the current submission, used if you submit multiple jobs at once.

The full ID is a 64-bit number.
Because most humans cannot remember long numbers like Job Nr: 234403430345905209543,
I replaced the numbers with an alphanumeric system of 36 characters: "ABCDEFGHJKLMNOPQRSTUVWXYZ1234567890-"

The full ID represented this way consists of 7 letters plus 2 preID letters:
{ASEDDEAab}
The folders RR/website/.. have the full ID.

But as this is still too long to communicate between humans and not easy to remember, all displays are cut to 3 letters.
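
[Editor's sketch] Because the display is shortened, two jobs showing the same 3 letters do not necessarily share a full ID. A small helper can separate the two cases once the full IDs have been collected (for example from the RR/website folder names). The function names are hypothetical, and the sketch assumes that the displayed ID is the leading 3 characters of the full ID, which this thread does not confirm.

```python
from collections import defaultdict


def group_by_display_id(full_ids, display_len=3):
    """Group full IDs by their shortened display form.

    Assumption: the display keeps the first `display_len` characters.
    """
    groups = defaultdict(list)
    for full_id in full_ids:
        groups[full_id[:display_len]].append(full_id)
    return dict(groups)


def true_duplicates(full_ids):
    """Return the full IDs that occur more than once, sorted."""
    seen, dupes = set(), set()
    for full_id in full_ids:
        if full_id in seen:
            dupes.add(full_id)
        seen.add(full_id)
    return sorted(dupes)
```

Several entries in one display group are harmless as long as `true_duplicates` returns an empty list; only identical full IDs would be addressed as the same job by the server.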

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Thu Jan 23, 2014 6:03 pm

OK, that makes more sense.

In the meantime I had added some delay between submissions to get different {3 digits} IDs, and I was up to 3 seconds :)

Now I've checked the full IDs on the website, and even for jobs that are submitted in the same script, enough happens between the two to get different IDs:


8WT_225W
8WT_BB5W

Here I imagine 8WT is the random part, 5W the last digits of the IP, and 22 and BB are based on the time, right?

I'll check whether some jobs get the exact same full ID, but so far I can't see any in the website...

False alarm about that :)

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Thu Jan 23, 2014 6:22 pm

UPDATE:
I've found some duplicates.

That would probably explain the strange behaviour (jobs seem to disappear/reappear).

I'll fix that in the code, deploy, and see how it goes.

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 12:25 pm

Hi, just to keep you posted:
I finally used the preID system to differentiate my jobs, and it completely solved the problem of jobs looping. I'm now quite sure the duplicate IDs were the source of all the problems.

My Royal Render was also prone to crashes and now looks more stable and faster.

I've also limited the number of "wait for approval" jobs by changing the way we work; it seems to speed up the dispatching loop and makes things more responsive (totally subjective).

Problem solved. I'll try to upgrade and move storage as soon as possible, but with less pressure :)

Rgds

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Wed Jan 29, 2014 12:31 pm

>I've also limited the number of "wait for approval" jobs by changing the way we work,
>it seems to speed up the dispatching loop and makes things more responsive (totally subjective)
Looping through 100 or 1000 jobs makes little difference in RR 7.0 if they are disabled/wait-for.
In RR 6.x it could have an effect if the server is a bit slower.

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 12:36 pm

I'm on RR6, but the machine is strong.
As I said, it's probably subjective, because it gives a much clearer view when I watch all the jobs of the company (finished grey ones look more transparent and are easy to ignore :) )

When jobs are in wait for approval in v6, doesn't RR check for frames?

Several times I've had the case where, after an RR restart, some old jobs in "wait for approval" had some frames missing for external reasons, and RR re-rendered them, not considering the job as really done.

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Wed Jan 29, 2014 12:38 pm

>jobs are in wait for approval in v6
RR does nothing if you do not send a command.

I have to check whether a restart causes the job to check. But AFAIK this behavior was changed 2 months ago.

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 12:40 pm

Also, the big difference for us is that we had something like 600 compo jobs in wait for approval, and only 5% of them were finally approved a few days later (the good ones, to generate delivery QuickTimes).

We were keeping the 95% of useless jobs in Wait For Approval for nothing, and asking artists to clean up their jobs had become an everyday task (they don't like emptiness :) )

After 700 jobs, RR had to auto-clear finished jobs, and the 3D guys were pretty disappointed at never getting access to the logs of their "finished" jobs.

Their life just changed :)

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Wed Jan 29, 2014 12:41 pm

>After 700 jobs, rr had to autoclear finished jobs
There is a setting in rrConfig when to remove finished jobs automatically.

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 12:42 pm

schoenberger wrote:
>jobs are in wait for approval in v6
RR does nothing if you do not send a command.

I have to check whether a restart causes the job to check. But AFAIK this behavior was changed 2 months ago.



My version is surely older than 2 months (6.02.09), and I'm 100% sure WFA or disabled jobs were re-running if frames were missing.

Quote:
>After 700 jobs, rr had to autoclear finished jobs
There is a setting in rrConfig when to remove finished jobs automatically.


Yep, I used to set it to 1000, but since artists never clean up, it just pushed the problem 300 jobs away :)

The only solution was to clean unfinished jobs after a shorter delay (via the options).

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 12:51 pm

The current method is much more efficient:

No more approval tasks, and compo jobs are set to auto-approve (for now).
The QuickTime tasks are set up as a "post done" job and unselected by default.

When a version is finally good enough to be encoded as a QuickTime, they just take the finished job, select the desired post job, save, and do a reset / keep frames.

If the job has been archived, they resubmit the job, selecting the required post job, and keep frames.

That way we don't spam ourselves with 95% useless WFA jobs, and we keep WFA for future useful features for the 3D guys.

Rgds

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Wed Jan 29, 2014 1:02 pm

Resetting a job deletes the render log files. But you said they want to keep the render logs?

If you upgrade, I would recommend that you place the QuickTime generation into a Python script.
Python scripts can be executed like enable/disable/reset in rrControl with a right-click.

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Wed Jan 29, 2014 1:08 pm

About the reset issue I mentioned. I have logged:
#rr3968 Control: New command: set status back to approval state

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 1:54 pm

About the log loss:
the post done jobs are for Nuke/compo jobs (to generate QuickTimes).
They don't care about the logs; they only read them when a job crashes.

The guys in Maya/3D don't use post done on their jobs (not relevant), so they always keep their logs.

About the Python option, is this in version 7?

I'll have to check the cost of it, considering I have 100 licenses :)


Anyway, for the QuickTime generation it is only a matter of time; we plan to do it at another level: later, through the asset manager, when the status of the images changes from "review" to "delivery", it will submit a dedicated render job.

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Wed Jan 29, 2014 1:59 pm

>the post done are for nuke/compo jobs ( to generate quicktimes)
You could enable it for Nuke jobs only?

>About the python option, is this in version 7 ?
Yes, a complete Python rewrite.

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 2:14 pm

schoenberger wrote:
>the post done are for nuke/compo jobs ( to generate quicktimes)
You could enable it for Nuke jobs only?


No, because we only generate those QuickTimes for images that are set to be delivered (maybe 5% of rendered jobs), and we never know in advance whether the shot will be validated and go to delivery (to editing).
Having it always enabled would mean generating 95% of the QuickTimes for nothing, eating storage bandwidth for nothing.


Rgds

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 5:47 pm

By the way,
is there a method to auto-clean the finished jobs of a particular user, or to auto-delete them as soon as they finish?

I have a user (batchtask) dedicated to... batch tasks (like QuickTime encoding or sealing file rights), and it generates hundreds of jobs a day.


Currently I try to delete its finished jobs every day, but when I don't, it quickly pushes towards the global limit of 700, and finished jobs that are not so old (2 days) have to be deleted to stay under the limit.

schoenberger
Site Admin


Joined: 02 Mar 2005
Posts: 3786

Posted: Wed Jan 29, 2014 6:01 pm

Not in RR 6.0.
Perhaps in 7.0; RR 7.1 at least will have a console application to send commands to the current job, such as delete.

pbillet



Joined: 24 May 2012
Posts: 155
Location/Company/Country: Paris/CGEV Studio/France

Posted: Wed Jan 29, 2014 6:03 pm

OK, I'll consider upgrading :)
All times are GMT + 1 Hour