OAS WebCache Dropping Connections

Soul Glo

I had 4 OAS application servers with WebCache running in my test environment for over 6 months. This past weekend I moved the servers to PROD with the exact same configuration and upgraded the ES to TR 8.98.2. Now, with more users on the system, WebCache craps out when it reaches about 100 users. The JAS.ini file is configured so that each instance allows 100 connections, so I know that is not where the issue is.

I also notice that when this spiral happens, re-starting just the JAS instances does nothing, but when I re-start WebCache everything starts working again. The WebCache event log says that the connection was dropped. I used a solution I found on the KG to increase the number of connections allowed in the WebCache httpd.conf file (I set it to 550), and I set the LOADLIMIT in the webcache.xml file to allow 100 per instance. I had really hoped that this was the answer. However, as soon as the system approached 100 users they started to get kicked out. I re-started WebCache and then they could log in again. That tells me there is something not right in the WebCache connection settings.

I have uploaded the WebCache event log file.

If anyone has any input I will appreciate it because evidently my severity 1 SR with Oracle is not being addressed. I opened it yesterday and everything they suggest is something I asked them their thoughts on...it is amazing where the support has gone, they are almost useless.
 

Attachments

  • 155511-Copy (2) of event_log.txt
    199.9 KB · Views: 471
[ QUOTE ]
If anyone has any input I will appreciate it because evidently my severity 1 SR with Oracle is not being addressed. I opened it yesterday and everything they suggest is something I asked them their thoughts on...it is amazing where the support has gone, they are almost useless.


[/ QUOTE ]

Cleola,

Time to play hardball. If you are not getting answers, tell them to escalate to second- or third-level support, especially if you are getting a newbie tech from a call center in Asia. Tell them to escalate it to someone in Denver. Those folks are usually more seasoned.

Good luck
- Gregg
 
I have...I got our account manager involved today, but OMG, Oracle and their log files: I think I have uploaded about 50 between today and yesterday. I finally told them that if they don't know the cause then they should tell me that and not keep asking for the same logs; the error is still exactly the same.

I think they realized that I was frustrated. I am trying to upgrade my old 8.96 WAS instances to 8.98.2 in the event I need to roll
 
Which HTTPD.CONF file, the web cache one or the ones for the JAS instances?
 
[ QUOTE ]
What is your ThreadsPerChild setting in HTTPD.CONF file?

[/ QUOTE ]

for the webcache httpd.conf file it is set at 50

ThreadsPerChild 50
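Since there are several httpd.conf files in play here (the web cache one and one per JAS instance), a quick way to check each of them is to scan the file text for the directive. This is only a minimal sketch: the helper name `find_directive` and the sample file contents are made up for illustration, and it assumes the directive is written on one line with a plain integer value.

```python
import re

def find_directive(conf_text, name="ThreadsPerChild"):
    """Return every integer value set for `name`, skipping commented-out lines."""
    # Apache directive names are case-insensitive, so match accordingly.
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\s+(\d+)\s*$", re.IGNORECASE)
    values = []
    for line in conf_text.splitlines():
        match = pattern.match(line)
        if match:
            values.append(int(match.group(1)))
    return values

# Hypothetical file contents, for illustration only.
sample_conf = """\
# ThreadsPerChild 250
ThreadsPerChild 50
MaxKeepAliveRequests 100
"""

print(find_directive(sample_conf))
```

To use it on a real file, read the file into a string and pass it in; running it over each httpd.conf shows which ones are still at the default.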
 
When you set up the origin server (JAS server) in Web Cache there is a field named capacity, which is the maximum number of concurrent connections. I think the default is 100, which is the approximate number of users you mentioned as well. Can you verify this?

I don't have real experience with setting up Web Cache in prod, so this is just a suggestion.
 
Hi Cleo,

50 is way too small for that many users (no, I wasn't referring to the Web cache instance). We had a similar issue which was resolved by upping this number from the default.

Oracle's documentation and knowledge base articles provide wildly differing recommendations for this setting (none of which align with the default).

Actual Apache documentation explains this setting as follows:
-----------------------------
ThreadsPerChild
Syntax: ThreadsPerChild number
Default: ThreadsPerChild 50
Context: server config
Status: core (Windows, NetWare)
Compatibility: Available only with Apache 1.3 and later with Windows
This directive tells the server how many threads it should use. This is the maximum number of connections the server can handle at once; be sure and set this number high enough for your site if you get a lot of hits.
-------------------------------
There's a way to monitor the connections which I can't recall right now, but the nature of the connections is transitory: when a web page is loading and there are graphic elements etc. to be retrieved to populate it, a connection is opened between the client and the server to download each item. If you've applied the registry patch to change the max IE connections from 2 to 10, you'll hit the server's max connections limit much quicker.

Gotta run but consider upping this number dramatically from the default.
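A rough back-of-the-envelope sketch of that point, using the numbers from this thread (about 100 users, ThreadsPerChild 50, and 2 vs. 10 connections per browser). It computes a worst-case ceiling only: since the connections are transitory, real concurrency will normally sit well below it. The function name is made up for illustration.

```python
def peak_connections(users, connections_per_browser):
    """Worst case: every user's browser has all of its connections open at once."""
    return users * connections_per_browser

threads_per_child = 50   # value reported earlier in the thread
users = 100              # roughly where the drops start

for per_browser in (2, 10):   # IE default vs. the registry-patched value
    peak = peak_connections(users, per_browser)
    verdict = "over" if peak > threads_per_child else "within"
    print(f"{per_browser} per browser: up to {peak} connections, "
          f"{verdict} the limit of {threads_per_child}")
```

Even with the unpatched browser setting the worst case is several times the configured thread count, which is why the default looks tight for this load.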
 
Alas my dear friends once again the good documentation comes from IBM.

Where is all the Oracle tuning stuff? Oh yeah, it doesn't exist, and if it did you'd be replacing it with WebLogic tuning in a few years anyway.

(Okay no stones please).

Here's the tuning stuff you need for the HTTP Server:

http://publib.boulder.ibm.com/httpserv/ihsdiag/ihs_performance.html

http://publib.boulder.ibm.com/httpserv/ihsdiag/2.0/mod_mpmstats.html

http://www.apache.org/server-status

http://www-01.ibm.com/support/docview.wss?rs=177&context=SSEQTJ&q1=monitor+http+thread&uid=swg21167658&loc=en_US&cs=utf-8&lang=en

Go ahead and knock yourself out. All these settings depend on the PLATFORM that the HTTP server is running on.

I made it my mission to make this my personal science and did extensive testing (on Windows).

My tests with 400 - 500 concurrent users on WAS 6.1, with 2 servers and 3 JVMs on each, showed that the IBM Tuning Documents were WAYYYYY OFF (both OAS and WAS use Apache so it's the same).

I really, really hated it in University when an experiment would show "no measurable difference", but that's pretty much what I found. As long as the server is fast enough and you didn't set the numbers stupidly low or high, it didn't matter.

Moreover on Windows other registry settings matter much more than tinkering with the Threads and all that.

On the AS400.............well it's an AS400. Until you get 1000's of users you can just let it do what it does best - take care of itself.

Preliminary tests on other OSes (the Unix flavours) do show some benefit to tuning these parameters.

Colin
 
Hi Soul, last year I had the same problem: connections dropped by Oracle Web Cache. The problem seems to be known (I attach a doc downloaded from metalink3). The information I extracted from this document allowed me to investigate the problem. I probably didn't solve it completely, but I mitigated it. The problem is related to the Windows TCP/IP protocol and Windows network connection management.

At the end of my investigation I set these parameters in the Windows registry:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
MaxFreeTcbs -> 10000
MaxHashTableSize -> 2048
MaxUserPort -> 65534
TcpTimedWaitDelay -> 30
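If you would rather import these than type them into regedit by hand, the values above can be turned into .reg file text. This is a sketch under a few assumptions: all four values are treated as REG_DWORD, the port value is written as MaxUserPort (the spelling Microsoft documents), and the helper name `build_reg_file` is made up for illustration. Back up the key and review the output before importing anything.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Decimal values as given in the post; assumed here to be REG_DWORD entries.
SETTINGS = {
    "MaxFreeTcbs": 10000,
    "MaxHashTableSize": 2048,
    "MaxUserPort": 65534,
    "TcpTimedWaitDelay": 30,
}

def build_reg_file(key, settings):
    """Build .reg file text; dword values are written as 8 hex digits."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in settings.items():
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"

print(build_reg_file(KEY, SETTINGS))
```

Changes under Tcpip\Parameters generally need a reboot before they take effect.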


good luck
gg
 

Attachments

  • 155679-RegistriWin2003.doc
    55 KB · Views: 450
[ QUOTE ]
There's a way to monitor the connections which I can't recall right now...

[/ QUOTE ]

Thread usage against the ThreadsPerChild setting in httpd.conf (default 50) can be monitored in the OAS Enterprise Manager Console under the following:

HTTP_Server > Response and Load Metrics > Thread Usage
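Outside the console, Apache's mod_status can report similar figures in a machine-readable form at /server-status?auto. Whether mod_status is enabled and reachable on this install is an assumption; nothing in this thread confirms it. The sketch below only parses the text of such a page. The sample text is invented for illustration, and the function names are hypothetical.

```python
def parse_status(auto_text):
    """Turn 'Key: value' lines from a mod_status ?auto page into a dict."""
    fields = {}
    for line in auto_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def busy_and_idle(fields):
    """Return (busy, idle) thread counts from the parsed fields."""
    # Apache 1.3 reports BusyServers/IdleServers; 2.x reports BusyWorkers/IdleWorkers.
    busy = fields.get("BusyWorkers", fields.get("BusyServers"))
    idle = fields.get("IdleWorkers", fields.get("IdleServers"))
    return int(busy), int(idle)

# Invented sample output, for illustration only.
sample = """\
Total Accesses: 1234
BusyServers: 48
IdleServers: 2
Scoreboard: WWWW__
"""

busy, idle = busy_and_idle(parse_status(sample))
print(f"busy={busy} idle={idle}")
```

A busy count that sits at or near ThreadsPerChild while users are being dropped would point at the thread limit rather than at the JAS instances.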

Need to put this up so others and myself can just look up this valuable info on the jdelist.


Hope it helps...
 