Unique problem

Jaise James

Reputable Poster
All,

I just started to notice a unique issue in my environment. We are on E810 (8.96.2.3) on a Wintel/SQL/WebSphere platform.

Users also use a Citrix server/fat client, and this problem happens when jobs are submitted from both the web and the fat client.

The problem is that, for a random period, none of the jobs submitted to our batch server run; they just error out. The batch server itself is running properly. Here is the error that I get in the job log:

DENET Error = eTimeOut

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.021000 jdecsec.c2537
Failed to communicate with security server: Unable to locate Security Server

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.052000 jdecsec.c323
Validate user by token has failed

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.083000 Jdb_ctl.c3687
JDB1100030 - Failed to complete Security check by Token

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.099000 Jdb_omp1.c627
JDB9900246 - Failed to find existence of default OMAP for environment STARTUP

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.114000 Jdb_rq1.c1795
JDB3100011 - Failed to get location of table F00941 for environment STARTUP

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.146000 Jdb_omp1.c627
JDB9900246 - Failed to find existence of default OMAP for environment STARTUP

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.161000 Jdb_rq1.c1795
JDB3100011 - Failed to get location of table F9861 for environment STARTUP

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.177000 evtcache.c892
isNewRTESystemEnabled - OMW table F9861 cannot be opened. RTE object check will fail.

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.224000 Runbatch.c515
JDB_InitEnvOvrExtToken failed with rcode = 0

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.239000 ipcmisc.c299
API ipcSawUnregisterProcV1 : process 7644 unregistered in entry 45

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.239001 Runbatch.c1340
Processing PrintUBE request failed - see previous messages

7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.271000 Runbatch.c1357
Batch job 1750952 ended in error



At the same time, I see the following errors:
1168/2044 MAIN_THREAD Sun Oct 17 00:17:38.611000 jdeksec.c770
INITIALIZING SECURITY SERVER KERNEL

1168/2044 MAIN_THREAD Sun Oct 17 00:17:50.673000 secnodemgr.c645
WARNING:Default Single Sign-on node configuration is being used. This is a potential security risk. Please refer to EnterpriseOne Security Administration Guide to setup Single Sign-on Node Configuration and Single Sign-on Token Lifetime Configuration.

1168/2044 MAIN_THREAD Sun Oct 17 07:01:14.377000 Netqueue.c2604
putExternalQueue0x04 (kernel) failed for msg id 5, pid=6316, queue name=<Krnl6316RspQ>, lastIPCError=<eIPCNotFound>.

1168/2044 MAIN_THREAD Mon Oct 18 09:18:05.724000 Netqueue.c2604
putExternalQueue0x04 (kernel) failed for msg id 4, pid=6904, queue name=<Krnl6904RspQ>, lastIPCError=<eIPCNotFound>.

1168/2044 MAIN_THREAD Mon Oct 18 11:28:20.849000 Netqueue.c2604
putExternalQueue0x04 (kernel) failed for msg id 4, pid=7244, queue name=<Krnl7244RspQ>, lastIPCError=<eIPCNotFound>.

1168/2044 MAIN_THREAD Mon Oct 18 11:28:50.693000 Netqueue.c2604
putExternalQueue0x04 (kernel) failed for msg id 4, pid=2576, queue name=<Krnl2576RspQ>, lastIPCError=<eIPCNotFound>.

1168/2044 MAIN_THREAD Mon Oct 18 11:29:20.286000 Netqueue.c2604


This batch server is its own security server. We have been running this way for over 5 years without this issue. We have not changed anything recently.

Does anyone have any idea?
 
I think you should actually look at the network first, and see why you're getting timeout messages.

Out of curiosity, do you have a section in your server's JDE.INI for [TRUSTED NODE], and is it active? We usually like to make sure it's commented out, unless we are running some long process.

The rest of the messages are a result of the failure to communicate. (Insert Cool Hand Luke joke here)
 

Is this before or after a recent services restart?
 
All,

The network is fine; I checked everything on it. This is our batch server, which acts as its own security server, so I am not sure why it would have to go to the network, at least for the security server part. As I said, the server continues to work fine before, during and after the issue. Some random jobs fail, and the same user is able to submit the job later on without a problem.

The logs are from around the same time.
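To check how closely the two logs line up, the entries can be parsed and compared by timestamp. The sketch below is only an illustration: the header layout is taken from the log lines posted in this thread, the function names `parse_log` and `preceding` are made up for this example, and the year (2010 here) is an assumption because the logs do not record it.

```python
import re
from datetime import datetime, timedelta

# Header layout as seen in the posted logs:
#   pid/tid THREAD Day Mon DD HH:MM:SS.ffffff source.cLINE
HEADER = re.compile(
    r"^(?P<pid>\d+)/(?P<tid>\d+)\s+(?P<thread>\S+)\s+"
    r"\w{3} (?P<mon>\w{3}) (?P<day>\d{1,2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<src>\S+)$"
)

def parse_log(text, year):
    """Return a list of (timestamp, pid, source, message) tuples.

    The year is supplied by the caller because the log omits it.
    Month names are parsed with %b, so an English locale is assumed.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    entries = []
    i = 0
    while i < len(lines):
        m = HEADER.match(lines[i])
        if m:
            stamp = f"{year} {m['mon']} {m['day']} {m['time']}"
            ts = datetime.strptime(stamp, "%Y %b %d %H:%M:%S.%f")
            msg = ""
            # The message is the next line, unless the entry was cut off.
            if i + 1 < len(lines) and lines[i + 1] and not HEADER.match(lines[i + 1]):
                msg = lines[i + 1]
                i += 1
            entries.append((ts, int(m["pid"]), m["src"], msg))
        i += 1
    return entries

def preceding(entries, when, window_seconds=120):
    """Entries logged in the window_seconds leading up to 'when'."""
    lo = when - timedelta(seconds=window_seconds)
    return [e for e in entries if lo <= e[0] <= when]

job_log = """\
7644/8020 MAIN_THREAD Mon Oct 18 11:29:50.021000 jdecsec.c2537
Failed to communicate with security server: Unable to locate Security Server
"""
kernel_log = """\
1168/2044 MAIN_THREAD Mon Oct 18 09:18:05.724000 Netqueue.c2604
putExternalQueue0x04 (kernel) failed for msg id 4, pid=6904, queue name=<Krnl6904RspQ>, lastIPCError=<eIPCNotFound>.

1168/2044 MAIN_THREAD Mon Oct 18 11:28:50.693000 Netqueue.c2604
putExternalQueue0x04 (kernel) failed for msg id 4, pid=2576, queue name=<Krnl2576RspQ>, lastIPCError=<eIPCNotFound>.
"""

job = parse_log(job_log, 2010)        # year is an assumption
kernel = parse_log(kernel_log, 2010)
hits = preceding(kernel, job[0][0])   # kernel entries just before the job failure
for ts, pid, src, msg in hits:
    print(ts, pid, src, msg)
```

With the two-minute default window, only the 11:28:50 kernel entry falls just before the job failure; the 09:18 entry does not. Widening or narrowing `window_seconds` is a judgment call.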
 
How many security kernels do you have started on this batch server? Have you looked at the batch server through SAW when the issue occurs? Does the security kernel have outstanding requests?
 
Nick04,
It seems like your security kernel is recycling and the old connections are not able to reconnect.

Check your INI to find out when it recycles.

Chan
 
We have 10 security kernels. We enabled trace on the security kernel and found nothing out of the ordinary. Oracle is reviewing it but has no clue so far.
 
Hello Chan,

I am not sure I understood what you meant. Where in jde.ini can I check when the security kernel recycles?
 
Could it be related to IPC resources running out?

What does the JDEIPC section look like in your batch server's JDE.INI?

Check Oracle Support document ID 631804.1.
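For posting the relevant INI sections, something like the following could pull them out of the file. This is a rough sketch using Python's standard `configparser`; the helper name `read_sections` is made up, and the keys in the sample text are placeholders, not real JDE.INI settings. Only the section names [JDEIPC] and [TRUSTED NODE] come from this thread.

```python
import configparser

def read_sections(ini_text, wanted):
    """Return {section: dict of settings, or None if the section is not active}."""
    parser = configparser.ConfigParser(
        strict=False,                 # tolerate duplicate keys or sections
        interpolation=None,           # treat '%' characters literally
        allow_no_value=True,          # accept keys without '=value'
        comment_prefixes=(";", "#"),  # commented-out lines are skipped
    )
    parser.optionxform = str          # keep key names case-sensitive
    parser.read_string(ini_text)
    found = {}
    for name in wanted:
        if parser.has_section(name):
            found[name] = dict(parser.items(name))
        else:
            found[name] = None        # missing, or its header is commented out
    return found

# Placeholder content only; exampleKey is NOT a real JDE.INI setting.
sample = """\
[JDEIPC]
exampleKey=1

;[TRUSTED NODE]
;exampleKey=2
"""

result = read_sections(sample, ["JDEIPC", "TRUSTED NODE"])
for section, settings in result.items():
    print(section, "->", settings if settings is not None else "not active")
```

To run it against the real file, read the JDE.INI contents into a string and pass that in. Note that if only the section header is commented out and the keys under it are not, those keys would be attributed to the preceding section, so it is worth eyeballing the file as well.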
 
There was a session leak in this kernel, which I came across not so long ago. This was fixed in the latest TR (or before, not sure when).
 
Since I saw RTE in your log, my guess is that it's in jde.ini; look for reauthentication, I think (I am not near the machine). Normally this happens when the connection for the ID you use to connect expires while old requests are still processing. But looking at your other log details, it might even be a network issue. Check whether there is a firewall you may not be aware of.

Chan
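One way to look for an intermittent firewall or timeout problem is to probe the listener port repeatedly and record how long each connection attempt takes. The sketch below is only that: `probe` is a made-up helper, and the host and port must be replaced with your own security server and its JDENET listen port (the values are not given in this thread). A successful TCP connect only shows that the port is reachable, not that the security kernel is answering requests.

```python
import socket
import time

def probe(host, port, timeout=5.0):
    """Try one TCP connection; return (succeeded, seconds_taken)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False, time.monotonic() - start

def probe_many(host, port, attempts=5, pause=1.0, timeout=5.0):
    """Probe several times so intermittent failures have a chance to show up."""
    results = []
    for n in range(attempts):
        results.append(probe(host, port, timeout))
        if n < attempts - 1:
            time.sleep(pause)
    return results

if __name__ == "__main__":
    # Demonstration against a throwaway listener on this machine.
    # Replace with the real host and port when testing the batch server.
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    demo_port = listener.getsockname()[1]
    for ok, secs in probe_many("127.0.0.1", demo_port, attempts=3, pause=0.1):
        print("connected" if ok else "FAILED", f"{secs:.3f}s")
    listener.close()
```

Leaving this running on a schedule during the period when jobs fail, and comparing failures with the eTimeOut timestamps in the job logs, would show whether the two coincide.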
 
Unless you have replicated system tables on the logic server, it will still need to traverse the network to validate the security information. As mentioned elsewhere, check your trusted node settings as well as your kernel recycling settings (in the JDE.INI on the logic server).
 