UBEs Failing - max resources Exceeded

HolderAndrew

Hi All,

We are running many subsystem UBEs and 'normally' submitted UBEs daily on AS400 iSeries (R810) V4R5M0 with Xe Base. We have had a relatively trouble-free production environment for the last couple of years, but recently we have been experiencing some serious UBE failures. The first thing that we noticed was job failures from the UBEs that are submitted from Job Scheduler via the RUNUBE command. The log on the AS400 is as follows:

*************** START *********
C2M1601 Escape 30 30/09/04 19:18:05 QC2UTIL1 QSYS *STMT RUNUBE B7333SYS *STMT

From module . . . . . . . . : QC2SIGNL
From procedure . . . . . . : raise
Statement . . . . . . . . . : 5
To module . . . . . . . . . : RUNUBE
To procedure . . . . . . . : main
Statement . . . . . . . . . : 334 *PRCLT
Message . . . . : Signal SIGABRT raised (abnormal termination).
Cause . . . . . : The signal SIGABRT was raised to indicate an abnormal
termination.
************ END ********************

Initially I thought that this might be related to the MCH3601 errors that we got about 2 years ago, which were related to a DataLogics PDF support PTF that we needed. The module in question for those errors was DLAPDFLKG. But I don't think that this is the same problem. This error has now happened about 6 or 7 times in the last week. When the error occurs we shut down services and restart. The strange thing is that sometimes jobs still fail immediately after the restart while others work OK.

The jde.log that we are getting (on the server) on both manually submitted UBEs and on UBEs submitted from Job Scheduler is as follows:


******** START OF JDE.LOG ***************
Sep 30 19:44:50 ** **** jdeDebugInit -- output disabled in INI file.
4551665

18276 Thu Sep 30 19:44:49 2004 ipcpub.c2903
process 18276 <B7333SYS/PRINTUBE> registered in entry 87

18276 Thu Sep 30 19:44:49 2004 runbatch.c631
Startup for User=SUIHKJOF, Env=PDFIN7333, Job#=4551665

18276 Thu Sep 30 19:44:49 2004 ipccrt.c216
IPC2100004 - ~Krnl18276RspQ not created, maxNumberOfResources parameter (1000) exceeded on allocation of IPCData.

18276 Thu Sep 30 19:44:49 2004 jdb_ctl.c1991
Net init failed or not initialized

18276 Thu Sep 30 19:44:51 2004 runbatch.c1101
RUNBATCH: Remote CP=1252, Remote OS=5, Local CP=37, ConvertToASCII=0

******** END OF JDE.LOG **************

The debug log (first page only!) is as follows:

********** START OF DEBUG.LOG ************
Sep 30 19:58:04 ** **** jdeDebugInit -- output to file.
<B7333SYS/RUNUBE> registered in entry 90

/RUNUBE> registered in entry 90
18307 Thu Sep 30 19:58:03 2004 runube.c537
Startup for User=WHSBATCH, Env=PDFIN7333, Report=R592000S, Version=TF8004

Sep 30 19:58:03 ** KERNEL type = KERNEL_UBE
Sep 30 19:58:03 ** JDERT_IsRealTimeEnabled setting g_IsRealTimeEnabled as 0
Sep 30 19:58:03 ** JDERT_IsRealTimeEnabled setting g_IsXAPIEnabled as 0
Sep 30 19:58:03 ** ONEWORLD Session started in ENV PDFIN7333 for WHSBATCH
Sep 30 19:58:03 ** Entering JDB_InitEnv
Sep 30 19:58:03 ** IPC2100004 - ~Krnl18307RspQ not created, maxNumberOfResources parameter (1000) exceeded on allocation of IPCData.
18307 Thu Sep 30 19:58:03 2004 ipccrt.c216
IPC2100004 - ~Krnl18307RspQ not created, maxNumberOfResources parameter (1000) exceeded on allocation of IPCData.

18307 Thu Sep 30 19:58:03 2004 jdb_ctl.c1991
Net init failed or not initialized

Sep 30 19:58:03 ** Entering JDB_SetEnv
Sep 30 19:58:03 ** Entering JDB_InitUser with commit mode 0.
Sep 30 19:58:03 ** Entering JDB_BeginTransaction
Sep 30 19:58:03 ** Locking /PD7333/specfile/gbrlink.ddb in READ mode.
Sep 30 19:58:03 ** LOCK: Total READ locks after operation: 1

********** END OF DEBUG LOG ************


It appears that we are now exceeding some kind of limit on IPC resources?

Part of our jde.ini file (which I hope is the relevant bit!) is as follows:


[JDEIPC]
maxNumberOfResources=1000
startIPCKeyValue=3001
avgResourceNameLength=15
maxMsgqEntries=1024
maxMsgqBytes=65536
ipcTrace=0

[JDENET]
serviceNameListen=6010
serviceNameConnect=6010
maxNetProcesses=20
maxNetConnections=2800
netShutdownInterval=15
maxKernelProcesses=100
maxKernelRanges=13
netTrace=0
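
For anyone who wants to compare these settings across servers or before and after a change, here is a minimal Python sketch that reads the IPC values from a jde.ini fragment like the one above. The function name `read_ipc_settings` is made up for illustration, and the sketch assumes the text starts with a section header and uses plain `key=value` lines as shown here; a full jde.ini may need extra handling.

```python
import configparser

def read_ipc_settings(ini_text):
    """Return the numeric settings of the [JDEIPC] section as a dict.

    Hypothetical helper: assumes ini_text begins with a section header
    and contains simple key=value lines, as in the fragment above.
    """
    # strict=False tolerates repeated keys; interpolation=None keeps
    # any '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep the original key case
    parser.read_string(ini_text)
    return {
        key: int(value)
        for key, value in parser["JDEIPC"].items()
        if value.strip().isdigit()  # skip anything that is not a plain number
    }
```

Calling it on the fragment above would give `maxNumberOfResources` as 1000, which is the same limit quoted in the IPC2100004 message.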

I would be very grateful if somebody could suggest some actions that might help us to solve this problem. We could of course just increase the 1000 resource parameter, but it would be nice to know why we have suddenly reached this limit. And anyway, we are not sure whether just raising this value might introduce some other errors! We have contacted PeopleSoft for their advice, but I was hoping that some of you experienced guys out there have already seen this problem and know the answer!

Thanks for any replies in advance.

Andrew Holder

OW Xe, SP18.1, AS400 V4R5M0, Citrix
 
The likely reason is that you are running past the IPC resource limit (1000).

Key question is why now?

Has anything changed in recent past?

Did any job run a very long time uninterrupted, accumulating all those resources over time?

Was any UBE ended via ENDJOB *IMMED, not giving it a chance to clean up its resources?

Could it be something as simple as the workload having grown over the past two years, so that the threshold has only now been reached?
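
One rough way to get at the "why now" question is to look at when the IPC2100004 failures occur, for example whether they cluster late in the day after resources have piled up. Below is a minimal Python sketch for scanning copies of the jde.log files off-box. The function names are made up for illustration, and the patterns are inferred only from the log excerpts posted above, so treat it as a starting point and not a definitive parser.

```python
import re
from collections import Counter

# Header line as seen in the posted jde.log, e.g.
# "18276 Thu Sep 30 19:44:49 2004 ipccrt.c216"
HEADER = re.compile(
    r"^(\d+)\s+(\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4})\s+(\S+)\s*$"
)
# Message line that follows the header in the posted excerpt.
IPC_ERR = re.compile(
    r"IPC2100004 - (\S+) not created, "
    r"maxNumberOfResources parameter \((\d+)\) exceeded"
)

def find_ipc_limit_errors(log_text):
    """Return (pid, timestamp, resource, limit) for each IPC2100004 entry."""
    hits = []
    pid = stamp = None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            pid, stamp = header.group(1), header.group(2)
            continue
        # match() anchors at the line start, so debug-trace copies such as
        # "Sep 30 19:58:03 ** IPC2100004 - ..." are not counted twice.
        err = IPC_ERR.match(line)
        if err and pid is not None:
            hits.append((pid, stamp, err.group(1), int(err.group(2))))
    return hits

def failures_per_hour(hits):
    """Count failures by 'Mon DD HH:00' taken from the header timestamp."""
    counts = Counter()
    for _pid, stamp, _resource, _limit in hits:
        _dow, month, day, clock, _year = stamp.split()
        counts[f"{month} {day} {clock[:2]}:00"] += 1
    return counts
```

If the counts climb steadily between restarts, that would point towards resources accumulating over time; if they appear immediately after a clean restart, raw volume looks more likely.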

Good luck.
 
Hi Princess!

Thanks for the reply. According to the users and operations here, there is no evidence that anything has changed. When jobs are shut down we use the 'E' type records to shut down subsystem UBEs in a controlled fashion and hold all job queues. When all UBEs have processed, we then end services and run CLRIPC before restarting. We do have huge volumes here, so maybe we have just reached a new threshold?

Regards,

Andrew
 
I'll pass your comments onto the client and pray that our OS upgrade to V5R2 next month works! Unfortunately for now we still have a problem to solve.
 