System becomes unavailable for a few minutes very often... Any ideas?

antoine_mpo

Hi list,

Our production platform is outsourced, so I no longer look after it. But for the past few days it has been a real mess!
The system is slow, and every day, at no particular time, it gets extremely slow and eventually becomes unavailable for a few minutes. It then recovers on its own, without anyone doing anything to fix it.

We use the web client (2 web servers using WebSphere load balancing, "clones" as they call them), Windows platforms (Oracle server, JDE server, BSFN server, UBE server, printing server) and an HTML compression device (BoostEdge).
When the system is unavailable, after a very long wait we get two error messages from BoostEdge:
error #503 (service unavailable)
error #504 (gateway timeout).

But it's not a BoostEdge issue (when you try to connect locally, it doesn't work either).

There are "experts" auditing the hardware, network, Oracle, ... but they don't seem to find what's wrong.

Has anyone ever run into this kind of symptom?
Any ideas or clues?

It's not that I want to help the outsourcing company (hey, they now do most of my previous work as a JDE/system administrator!), but it's for our end users!
They really haven't been able to work for several days now!

Thanks for your help.
 
Hi Antoine,

Have you checked for viruses, spyware, adware, trojans, etc.?

Sebastian Sajaroff
 
Hi Sebastian,

(Nice, another French speaker!)
Yes, viruses have been checked.
They also considered a "denial of service" attack, but that doesn't seem to be the case. They put a sniffer at the entry point but didn't spot anything obvious.
 
Hi Antoine,
I have a customer with a similar configuration (all servers outsourced, EO 8.93, web interface, 2 WebSphere 4.05 servers with horizontal and vertical clones on Windows 2000; the difference is that the EnterpriseOne server is an AS400).
My only suggestion: set the network card speed and the corresponding switch port speed to the same value. We configured the servers' network cards to 100 Mbps full duplex and the corresponding switch ports to 100 Mbps full duplex.
Before this change we had a similar problem.
Questions:
1) Do you have at least one terminal server connection or generator client connection? If so, can you connect correctly to TS and then to EnterpriseOne?
2) If not, what do you mean when you say "... (when you try to connect locally, it's not working neither) ..."?

I hope this helps.
Good luck
Gigi
 
I had the same issue after upgrading to WAS 7. The cause was a plug-in parameter that defaults to 0 (no timeout) in WAS 6.1 and to 60 seconds in WAS 7. Most of our queries take longer than 60 seconds. I set the value back to 0, since that is what worked for us in WAS 6.1, and the sporadic HTTP 504 Gateway errors have stopped. I found this by looking at the FFDC log on the application side and the HTTP_plugin.log on the HTTP side. The FFDC log (exception log) showed com.ibm.wsspi.webcontainer.ClosedConnectionException, and the HTTP_plugin.log showed "ServerIOTimeout fired. Timeout 60". This value is stored in the plugin-cfg.xml file. I changed it from the WAS Admin console: go to Application Servers, select your application server, open Web Server Plug-in Properties, and change the Read/Write timeout to 0 or to something higher than 60. You can also change it manually in plugin-cfg.xml, which is typically found at
/QIBM/UserData/WebSphere/AppServer/V7/ND/profiles/yourprofile/config/cells/yourcell/nodes/yournode/servers/IHS_yourwebserver/plugin-cfg.xml
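
If you want to check quickly whether any server entry still carries a short timeout, something like the following Python sketch could help. It assumes the timeout is stored as a ServerIOTimeout attribute on the Server elements of plugin-cfg.xml; the function name, the sample XML and the server names are made up for illustration, so compare against your own file before relying on it.

```python
import xml.etree.ElementTree as ET


def find_short_io_timeouts(plugin_cfg_xml, threshold_seconds=60):
    """Return (server name, timeout) pairs for Server elements whose
    ServerIOTimeout is positive and at or below threshold_seconds.

    Assumption: the timeout is an attribute named ServerIOTimeout on
    each <Server> element. A value of 0 is treated as "no timeout".
    """
    root = ET.fromstring(plugin_cfg_xml)
    flagged = []
    for server in root.iter("Server"):
        raw = server.get("ServerIOTimeout")
        if raw is None:
            continue  # attribute not set on this entry
        timeout = int(raw)
        if 0 < timeout <= threshold_seconds:
            flagged.append((server.get("Name"), timeout))
    return flagged


# Hypothetical, heavily trimmed sample; a real plugin-cfg.xml has many more elements.
SAMPLE = """<Config>
  <ServerCluster Name="example_cluster">
    <Server Name="node1_server1" ServerIOTimeout="60"/>
    <Server Name="node2_server1" ServerIOTimeout="0"/>
  </ServerCluster>
</Config>"""

print(find_short_io_timeouts(SAMPLE))
```

To run it against a real file, read the file contents into a string first and pass that string to find_short_io_timeouts. Any server it lists is a candidate for the 504 errors described above if your queries run longer than the listed timeout.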
 