Intermittent Scheduler Fallover after Database Upgrade

CC_CNC

Hi Folks,

I'm hoping you can help with an issue we've been experiencing with our JDE OneWorld / EnterpriseOne production environment recently.

We have an issue where the scheduler stops submitting jobs at random times, mainly after 6pm and before 8am (we have test jobs scheduled to run every 5 minutes, submitting to each server, so we can see when the failures occur). The issue occurs at the same time on all servers, which suggests the fault lies with the scheduler rather than with the report queues themselves. Restarting and resetting the scheduler submits all pending jobs, and the system then continues to function as usual for a while, until the issue occurs again. In addition, the first manually submitted job of the day (to each server) fails to submit, with subsequent jobs submitting and running successfully immediately afterwards. The services are not taken down overnight, but it seems that something goes wrong that causes processes on the enterprise servers to fall over, and the first job of the day is needed to poke the services into waking up and fixing this. Everything else appears to work correctly, and the scheduler does work most of the time (once it's been restarted).
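For anyone wanting to pin down the failure windows from the 5-minute test jobs, here is a minimal Python sketch. It assumes the submission times of the test jobs have already been exported as datetime values (how they are pulled out of JDE is not covered here), and the function name and the one-minute tolerance are illustrative choices rather than anything from our setup.

```python
from datetime import datetime, timedelta


def find_gaps(submit_times, expected=timedelta(minutes=5),
              tolerance=timedelta(minutes=1)):
    """Return (last_seen, next_seen) pairs where consecutive test-job
    submissions are further apart than expected + tolerance."""
    ordered = sorted(submit_times)
    gaps = []
    # Compare each submission time with the one that follows it.
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > expected + tolerance:
            gaps.append((prev, curr))
    return gaps
```

Each returned pair marks the last successful submission before a dropout and the first one after it, which can then be lined up against the server logs.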

Our system setup is as follows: JDE B733 SP20 running on Windows 2000 enterprise servers, with an Oracle 10g database hosted on Unix (Solaris 10), using an EMC NAS connected via a private VLAN to hold its data. The issues started to occur after we moved from our old database server, which ran Oracle 8i on Unix (Solaris 8) and used its local hard disk for data storage. Sadly our implementation has been customised to the point where we cannot apply more recent service packs or easily upgrade to a later version of JDE. We plan to replace the system with our company's European finance system in the next couple of years, but as a result of the credit crunch that project was put on hold, meaning we've had to invest in upgrading the system to get it through a few more years in the short term.

Looking at the Oracle logs and various server logs, there are no signs of issues with the network I/O, CPU, disk or memory of any of the servers, nor any database errors. The JDE logs on the enterprise servers do show "ACCESS_VIOLATION" errors and "(WSAECONNRESET): Connection was reset by peer" around the same times that the scheduler drops out.
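As a rough illustration of how those two messages could be pulled out of a log for comparison with the dropout times, here is a small Python sketch. It makes no assumption about the JDE log line layout and only does a plain substring match; the function name is made up for the example.

```python
MARKERS = ("ACCESS_VIOLATION", "WSAECONNRESET")


def scan_log_lines(lines, markers=MARKERS):
    """Return (line_number, marker, text) for each line containing a marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for marker in markers:
            if marker in line:
                # Record the first marker found on the line, then move on.
                hits.append((number, marker, line.rstrip("\n")))
                break
    return hits
```

It accepts any iterable of lines, so an open file handle for one of the enterprise server logs can be passed in directly.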

We've thought of a number of workarounds, but are hoping to find a way to resolve the issue rather than relying on a bodged process. We've also brought up our disaster recovery environment to see if we can recreate the issues there and to experiment with different configurations, so we have a few things in the pipeline. However, if anyone has come across issues such as this before, or knows of other logs or methods we could use to investigate this problem, your ideas would be most appreciated.

Thank you once again for your help,

John
 
Do the failures happen at about the same time as your backups run?

The WSAECONNRESET errors refer to the database timing out or dropping the database connection (or, at least, that is how it looks from the JDE side).

The fact that your queues are also suffering from this indicates to me that at some point during the night your database is disconnecting sessions for some reason.
 
Hi Jon,

Good suggestion, though sadly we'd wondered that ourselves, so we tried turning off all backups (not just JDE) for a night, but found the scheduler still had issues.

We've now replicated our live environment using our DR boxes on a different site, and are finding that this system also falls over at random times (not in sync with the live system's errors).

Thanks for the suggestion,

JB
 
Hi All,
FYI: We've been working with some CNC experts from SysTime, who suggested replacing the Oracle clients installed on our servers with earlier releases. Moving back to the 8i clients, and updating the corresponding DLLs in the JDE.INI files and in the F98611 to use JDBOCI80, seems to have done the trick.
Thanks again to all for reading this post and offering suggestions.
Kind regards,
John
 
Can you name them? :)

On Mon, Nov 24, 2008 at 9:34 AM, CC_CNC



--
Regards,
Nitin Shetye
 