E9.2 Kernel IPC Error

Soumen

Dear List,

Recently we had an issue on our production server where, all of a sudden, a lot of processes crashed. When I sequenced the events, it all started with the message below, from the first kernel to blow up. The PID referred to here, 10428, was the security kernel.

While investigating, I combed through all the event logs trying to piece together what could have caused this. So far I do not see anything in the event logs, or anything in the JDE logs, prior to this message.

When I saw this message, I went ahead and terminated the security kernel, hoping the other kernels would connect to the newly spawned security kernel, but it could not recover and all kernels started throwing IPC errors. See the second log below.

I have searched MOS, the Oracle community, and JDEList for the error messages reported below but could not find anything. Are there any other pointers someone can provide that might offer a clue?

Thanks,
Soumen
9.2
Windows 2016
SQL Server 2016
WebLogic 12c

============ 1st Log showing errors =============

7592/19688 Wed Nov 30 13:01:24.822000 netsig.c726
Net program ended, pid=7592, signal = 11

7592/19688 Wed Nov 30 13:01:24.824000 ipcmisc.c348
IPC2300010 - ipcRemoveResource called on a locked resource Krnl10428ReqQ (idx=165), unlocking it.

7592/19688 Wed Nov 30 13:01:24.827000 ipcmisc.c348
API ipcSawZombieProcV1 : process 7592 set to Zombie in entry 7

INFO: Done setting IPC Handle State structures to abandoned. Process exiting. iParam: 00000000000000007592


========= subsequent kernel logs ... posting one of them ===============

5344/20028 Wed Nov 30 13:01:42.424000 ipcmisc.c348
IPC3700029 - sendMessage (idx=165) Timed out trying to lock queue for sending. Queue properties: alloc-1612709920, used--889192052, head--300414547, tail-196422, lastHole-2097184, highest-25976864, dataOffset-162384640, allocBytes-4282838552, usedBytes-2097154

5344/20028 Wed Nov 30 13:01:42.425000 extmsgq.c1245
Could not pass message hdrbuf to jdenet_k kernel queue=<Krnl10428ReqQ> (lastIPCError=<eIPCTimedOut>),Message aborted for msgType=568.

5344/20028 Wed Nov 30 13:01:42.425001 extmsgq.c1051
Logging callstack for jdenet_k kernel reskrnl=10428.

Process call stack dumped successfully in file <D:\JDEdwards\E920\log\jde_10428_5344_2022113013142_1_dmp.log> iParam: 00000000000000000000
5344/20028 Wed Nov 30 13:01:42.655000 extmsgq.c1324
NET2000002: Could not pass message to jdenet_k kernel queue=<Krnl10428ReqQ> (lastIPCError=<eIPCTimedOut>), Message aborted, type=568

5344/20028 Wed Nov 30 13:01:42.656000 jdecsec.c510
JDENET Error = JDENET eIPCErr: eIPCNoError

5344/20028 Wed Nov 30 13:01:42.656001 jdecsec.c3020
Failed to communicate with security server: Unable to locate Security Server
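
For anyone sequencing events across several kernel logs as described above, here is a minimal Python sketch. It assumes each entry is a header line (pid/tid, weekday, date, time, source file and line number) followed by one or more message lines, as in the excerpts in this post. The name parse_entries and the year argument are illustrative only (the headers carry no year) and are not part of any JDE tooling.

```python
import re
from datetime import datetime

# Header layout assumed from the excerpts above, e.g.
# "7592/19688 Wed Nov 30 13:01:24.822000 netsig.c726"
HEADER = re.compile(
    r"^(?P<pid>\d+)/(?P<tid>\d+)\s+\w{3}\s+"
    r"(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<src>\S+?\.c)(?P<line>\d+)\s*$"
)

def parse_entries(text, year):
    """Return log entries sorted by timestamp (hypothetical helper)."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = HEADER.match(line)
        if m:
            stamp = " ".join(m["stamp"].split())
            current = {
                "pid": int(m["pid"]),
                "tid": int(m["tid"]),
                # The year is supplied by the caller because the log omits it.
                "time": datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f"),
                "source": m["src"],
                "line": int(m["line"]),
                "message": "",
            }
            entries.append(current)
        elif line and current is not None:
            # Non-header lines belong to the most recent header.
            current["message"] = (current["message"] + " " + line).strip()
    return sorted(entries, key=lambda e: e["time"])

# Two abridged entries taken from the logs in this post, deliberately out of order.
sample = """\
5344/20028 Wed Nov 30 13:01:42.424000 ipcmisc.c348
IPC3700029 - sendMessage (idx=165) Timed out trying to lock queue for sending.

7592/19688 Wed Nov 30 13:01:24.822000 netsig.c726
Net program ended, pid=7592, signal = 11
"""

timeline = parse_entries(sample, 2022)
for e in timeline:
    print(e["time"].time(), e["pid"], e["source"], e["message"])
```

Concatenating the contents of several jde_*.log files and passing the result to parse_entries gives a single timeline, which makes it easier to see which process logged the first error.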
 
It sounds like you're either not able to communicate with the security server from this box, or your security kernels are hung. I would try pinging the security server box and, if you can reach it, terminating the existing security server kernels and letting them respawn.

Tom
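
The reachability check suggested above can be sketched in Python as follows. This is a TCP connect test, not an ICMP ping, so it only shows whether something is accepting connections on the given port. The helper name can_connect is illustrative, and the host and port are placeholders: the real values would have to come from your own configuration (for example the JDENET port in the server's jde.ini, which is an assumption about your setup).

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

# Example call with placeholder values (replace with your own host and port):
# print(can_connect("your-security-server", 6017))
```

A True result only means the listener is up. A hung kernel can still accept connections, so the kernel logs remain the better indicator.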
 
SIG 11 is an access violation, so it looks like a JDE bug, a C++ RTL mismatch, or a hardware fault. 7592 was a Net process, right?

MSG 568 is part of a sign-on sequence.
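
One further observation, offered tentatively: the queue properties in the IPC3700029 message include negative values for used and head. That may be consistent with the queue header for Krnl10428ReqQ being left in a bad state when the Net process died, but this is an inference from the numbers and not something the log states. A small Python sketch for pulling those fields out of such a line, assuming the "name-value" layout seen in the log (the function names are illustrative):

```python
def parse_queue_properties(message):
    """Parse the 'Queue properties:' list of an IPC3700029 line into a dict."""
    _, found, tail = message.partition("Queue properties:")
    if not found:
        return {}
    props = {}
    for item in tail.split(","):
        # Each item looks like "name-value"; a second "-" is a minus sign.
        name, sep, value = item.strip().partition("-")
        if sep:
            props[name] = int(value)
    return props

def negative_fields(props):
    """Return the names of fields whose value is below zero."""
    return [name for name, value in props.items() if value < 0]

# The line below is copied from the second log in this thread.
line = (
    "IPC3700029 - sendMessage (idx=165) Timed out trying to lock queue for sending. "
    "Queue properties: alloc-1612709920, used--889192052, head--300414547, "
    "tail-196422, lastHole-2097184, highest-25976864, dataOffset-162384640, "
    "allocBytes-4282838552, usedBytes-2097154"
)

props = parse_queue_properties(line)
print(negative_fields(props))  # prints ['used', 'head']
```

Whether negative counters are ever legitimate for this queue structure is not documented in the thread, so treat the result as a prompt for further investigation and not as a diagnosis.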
 