Scheduler Kernel dying

mtrottier

We are running 8.12 SP 8.96.1.4 on Windows with Oracle.

I hesitate to even post this because I have so little to go on and have yet to identify any kind of pattern, but our scheduler kernel keeps dying at random times without any errors in the scheduler logs. In fact, sometimes when we check the P91300/W91300G “Scheduler Server Control” screen, it even shows that the scheduler server is still running.

Anybody ever had a similar problem?
 
Not sure if this pertains to your setup but...

We had a similar problem a few months back where the scheduler kernel was dying randomly every 1-3 days with little info in the logs. We found that another scheduler kernel was running on a second logic server that was not being used as the scheduler server. We disabled this unneeded scheduler kernel, restarted JDE services on our logic servers, and have not had the problem since.
 
Hi,

I had the exact same problem with the same Tools Release,
but on a UDB/DB2 database.
It was solved after installing Tools 8.96.1.5.
 
Thanks for the reply, but even though we do have a second enterprise server for DV and PY, there is no scheduler kernel running on that server.
 
I was afraid someone was going to suggest a tools release upgrade. I’ve already done a couple of 8.97 upgrades and I still don’t trust it. I’m also afraid that after going through the upgrade that the problem won’t go away because at this point Oracle Support won’t confirm that there is a fix for it in the SP.

By the way, we are running a Windows cluster here, and I already went the rounds with Oracle because there is not one word about clustering in any of the 8.97 Server Manager documentation. I learned the hard way when I did the install that the instructions for clustering in the JDE Install Documentation are completely wrong.
 
Hi Mike,

The Tools upgrade I'm suggesting is 8.96.1.4 to 8.96.1.5;
in that case, you don't need to install Server Manager.
 
Mike, I had this exact same issue on ERP 8.0 a few years ago. We ended up having our night shift operator stop and restart the scheduler kernel every night because we couldn't determine what was causing it to simply stop submitting jobs. In the end we went with a more robust third party scheduler because of our lack of confidence in the built-in one. I am surprised, although not very, that the same type of event is still happening. It appears to be AS/400 related though.
 
Hi Mike,

We have the same problem with the scheduler on 8.10 SP 8.96.1.5 with a clustered Oracle database.

What I have noticed is that if the kernel cannot communicate with the Oracle database, the kernel dies. Users can log on and submit reports, but the scheduler is dead.

It is simple: if the JDE scheduler is not working in the morning, I call the Oracle DBA and ask if he had a problem with the DB, and 99% of the time there was a failover or something else going on. 99% of the time, he forgets to reset the JDE services.
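
For anyone who wants to catch that situation without phoning the DBA, here is a minimal sketch of the idea in Python. It is only an illustration, not anything shipped with JDE: the function names are made up, the host and port of the database listener are whatever applies at your site, and a plain TCP connect only shows that the listener answers, not that the database itself is healthy.

```python
import socket


def db_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to the database listener succeeds.

    host and port are site-specific assumptions supplied by the caller.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def needs_service_reset(history):
    """Decide whether the JDE services probably need a reset.

    history is a chronological list of True/False reachability checks.
    The pattern of interest is the one described above: the database
    dropped out at some point (for example during a failover) and is
    reachable again now, so the scheduler kernel may already be dead.
    """
    if not history:
        return False
    came_back = bool(history[-1])
    had_outage = not all(history)
    return came_back and had_outage
```

Run db_reachable on a timer (say once a minute), append each result to a list, and alert someone when needs_service_reset returns True. Whether to restart the services automatically or just send a notification is a judgment call for each site.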
 
We have the exact same issue. First we had 2 specific jobs that were causing the scheduler clock to "freeze", and it would have to be manually reset. Oracle's recommendation: take the jobs out of the scheduler. Then they closed the case.
(1 week passes...)
Now our scheduler kernel goes zombie anywhere from two days to two weeks w/o any discernible pattern. I have had a call open with Oracle for six weeks and no luck so far. First we installed the Windows Debugging Tools in hopes of getting a dump file. (When the process dies, the debugger does not kick in as Oracle claimed it would.) Then they sent me a custom JDENET_K which was supposed to have "special debugging" in it. (Not that it ever worked.) We are clustered, but it fails on either node (although Node 2 has been up for a record 15 days).

My only suggestions -

First - religiously monitor ES SAW and clean up any zombie kernels. Seems to help.

Second - ESU #JK13356.

Third - in the "[JDENET_KERNEL_DEF6]" section of the jde.ini file for port 6014 on the ES, change the "maxNumberOfProcesses=100" setting to "maxNumberOfProcesses=2". As a general rule of thumb, you should have no more than 6 to 10 users per call object kernel.
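
If you would rather script that jde.ini change than hand-edit it, something along these lines should work. This is a rough sketch that assumes jde.ini parses as an ordinary INI file; the helper name set_max_processes is invented for the example. Note that Python's configparser drops comment lines when it writes the file back out, so keep a backup of the original and compare the result before putting it on the server.

```python
import configparser
import io


def set_max_processes(ini_text, section, value):
    """Return ini_text with maxNumberOfProcesses in `section` set to `value`.

    Raises configparser.NoSectionError if the section is not present.
    """
    # interpolation=None leaves any '%' characters in values untouched;
    # strict=False tolerates duplicated sections or keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # Preserve key case (the default would lowercase maxNumberOfProcesses).
    parser.optionxform = str
    parser.read_string(ini_text)
    parser.set(section, "maxNumberOfProcesses", str(value))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Cut-down sample containing only the setting mentioned above.
sample = "[JDENET_KERNEL_DEF6]\nmaxNumberOfProcesses=100\n"
updated = set_max_processes(sample, "JDENET_KERNEL_DEF6", 2)
print(updated)
```

The services still have to be restarted for the new value to be picked up.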
 
John,

There is a fourth option - don't use the scheduler. There are some very good third party scheduler applications that are more robust.

Gregg Larkin
North American JDE Systems Engineer
 
Gregg,

You're absolutely correct. I recommended Tidal, AppWorx, and Silk. Unfortunately for this client, they are cost prohibitive. They are currently evaluating BPA7, but it does not have an E1 adapter, so we are looking at runube.exe as part of the replacement. Not ideal.
 
John,

We have been using Tidal for four years now. Very good product. Has a bit of a learning curve, but that's to be expected for a scheduler that can do JDE, batch jobs, Oracle, SAP, Webmethods, Cognos and more. We brought it in for JDE, and it has grown into a key application for batch processing for a whole lot more than JDE.

Gregg
 
Mike,

I can also confirm that moving to 8.96.1.5 fixed this issue for us. I lost a lot of sleep over this issue because it would always quit around 12:05am, so I had to reset it. I was extremely happy when we got the issue resolved.
 
John,

So I see from your tag that you are on 8.96.2.2 and you still see the issue?

If that's true, that would eliminate the idea of upgrading my tools release to resolve the issue.
 
Mike,
We are in the process of upgrading from 8.96c1 to 8.96.2.3 (we are on a Wintel/SQL platform). I was about to go ahead with the upgrade when I read your post. I decided to install Scheduler in my QA environment to test it. It has been running fine for the last 2 days. I will let it burn in for a week before determining whether it's working or not.

Question to everyone else who has encountered the issue: did you all have this issue only with Oracle on Windows, or did you see it on other platforms, and specifically, did anyone see it on SQL?

Thanks
 
We had a similar problem right back as far as ERP 8.0. What was killing it was the Oracle analyze of the F98OWSEC table. Once we removed this table from the analyze, it no longer died. We have also kept this table out of the analyze in 8.12. Hope this helps.
 
Maybe 8.96.2.3 is better. I have a developer involved, and so far all we can determine is that when the ES loses connectivity to the DB, the scheduler kernel dies. No rhyme or reason why it loses connectivity or what's causing it.
 