dhelsley
Well Known Member
HpUx, Xe, sp14.2, Oracle 8.1.5
We have a situation that causes our queues to back up. I was wondering if anyone else has seen it. Or, if anyone with a similar environment could test it for me, I'd appreciate it.
Some of our versions of R43500 have a large number of sections with ER and layout overrides. Because of this, the specs are sent to our enterprise server (.pak) and merged into the enterprise server specifications before the UBE runs. While this is happening, an exclusive lock (a mutex in SAW) is issued for the UBE and version. This lock prevents another iteration of the same UBE and version from attempting to merge its specs at the same time.
Now here's the problem. Merging the specs for this UBE takes, on average, 90 seconds (I've seen it take 3 minutes). During our peak we have dozens of users submitting this job to our multithreaded queue at about 1 every 15 seconds. Each of these locks the next out for 90 seconds. Soon we have every thread of our queue filled with the same UBE/version in an "S" status. The mutex only blocks the same UBE and version, but since all of the threads are waiting to run this report nothing else can get in.
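To put numbers on the pile-up, here is a rough Python sketch of the queue. It is only a model, not anything from OneWorld itself: the 4-thread queue, the first-come-first-served hand-off of the lock, and the function name `simulate` are all assumptions for illustration. The 15-second arrival gap and 90-second merge are the figures above, and the 8-second run time is the one from the log example in this post.

```python
import heapq

def simulate(threads, arrival_gap=15, merge_secs=90, run_secs=8, jobs=12):
    """Model one UBE/version submitted repeatedly to a multithreaded queue.

    Returns one tuple per job: (arrival, got_thread, merge_start, finish).
    Assumption: the merge lock is handed out in submission order.
    """
    free_at = [0] * threads          # time at which each queue thread is free
    heapq.heapify(free_at)
    lock_free = 0                    # time at which the merge lock is released
    results = []
    for k in range(jobs):
        arrival = k * arrival_gap
        got_thread = max(arrival, heapq.heappop(free_at))
        merge_start = max(got_thread, lock_free)   # sits in "S" until then
        lock_free = merge_start + merge_secs       # only the merge is locked
        finish = lock_free + run_secs
        heapq.heappush(free_at, finish)
        results.append((arrival, got_thread, merge_start, finish))
    return results

if __name__ == "__main__":
    for arrival, got_thread, merge_start, finish in simulate(threads=4):
        print(f"submitted {arrival:4d}s  thread at {got_thread:4d}s  "
              f"merge at {merge_start:4d}s  done at {finish:4d}s")
```

In this model all four threads are holding the same UBE/version 45 seconds in, and the fifth submission does not even get a thread until the first job finishes at 98 seconds. Any unrelated job submitted after that point would be stuck behind the same wait.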
To reproduce the situation, just override the ER for every section in a version of R43500 and submit it with logging turned on. You can tell if you have the same situation by checking the time your UBE spends "Starting". This is the time between the Startup and RUNBATCH entries. Notice the time between those two in this bit of jde.log. The actual job ran in 8 seconds. Merging the specs took three minutes.
14702 Tue Jun 5 10:42:41 2001 runbatch.c618
Startup for User=DHELSLE, Env=DV7333, Job#=73481
14702 Tue Jun 5 10:45:42 2001 runbatch.c1066
RUNBATCH: Remote CP=1252, Remote OS=5, Local CP=1252,
14702 Tue Jun 5 10:45:50 2001 ipcpub.c3214
API ipcSawUnregisterProcV1 : process 14702 unregistered inIt
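If you want to check a pile of logs rather than eyeball them, something like the following Python sketch would pull out the Startup-to-RUNBATCH gap per process. It assumes the jde.log layout is exactly what the excerpt above shows (a header line with process ID, timestamp and source file, followed by the message line) and that the day and month names are in English. The names `startup_delays`, `HEADER` and `SAMPLE` are made up for this example.

```python
import re
from datetime import datetime

# Header line as in the excerpt: "<pid> <Day Mon D HH:MM:SS YYYY> <source file>"
HEADER = re.compile(
    r"^(\d+)\s+(\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s+\S+"
)

def startup_delays(log_text):
    """Return {pid: seconds between the Startup and RUNBATCH entries}."""
    started, delays = {}, {}
    pid = stamp = None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = HEADER.match(line)
        if m:
            # Remember whose message the next line is and when it was logged.
            pid = int(m.group(1))
            stamp = datetime.strptime(" ".join(m.group(2).split()),
                                      "%a %b %d %H:%M:%S %Y")
        elif pid is not None and line.startswith("Startup for User="):
            started[pid] = stamp
        elif pid in started and line.startswith("RUNBATCH:"):
            delays[pid] = (stamp - started[pid]).total_seconds()
    return delays

SAMPLE = """\
14702 Tue Jun 5 10:42:41 2001 runbatch.c618
Startup for User=DHELSLE, Env=DV7333, Job#=73481
14702 Tue Jun 5 10:45:42 2001 runbatch.c1066
RUNBATCH: Remote CP=1252, Remote OS=5, Local CP=1252,
"""

if __name__ == "__main__":
    print(startup_delays(SAMPLE))
```

For the excerpt above this gives 181 seconds for process 14702, which is the three-minute merge.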
This occurs on all three of our Unix enterprise servers. We've tested it on an NT app server and the delay is 2-3 seconds instead of 90.
I use R43500 as an example here. I've tested other UBEs with substantial amounts of ER logic. The result is the same.
Thanks for your help.
David D. Helsley, Inc.
Independent IT Consultant
[email protected]
Edited by dhelsley on 6/8/01 04:36 AM.