VM Virtual Server to Hardware Ratio

PAULDCLARK

Just a quick one, on OAS Web Servers what kind of ratio are you getting, i.e. how many VM servers are you getting off a physical box?

Oracle quote up to 10:1, which seems a bit steep to me; I would have thought between 2:1 and 5:1.

Thanks
 
Paul,

2:1? Wow, that wouldn't be worth the effort. There is no magic number for your question; it all depends on the size of your VM host. A properly configured VM host will give you 80 to 90% of the horsepower of a similarly configured physical box. There is VMware overhead with shared NICs, RAM, CPU, etc. You will want to make sure that there is lots of RAM on the VM host. You will want to avoid paging. A paging OAS physical server is slow as a dog. A paging OAS VM server is as slow as a two-legged dog. Personally, I prefer physical boxes for production, and VM boxes for test. The other issue I don't like about VM boxes is that I don't have any idea of what else the infrastructure guys are running on the hosts. Since it is so easy to shove VM boxes from host to host, the performance of a VM box can vary wildly. I prefer a good physical box any day of the week.

- Gregg
 
Turns out our infrastructure people are running 40+ servers on 3 physical machines. I'm quite impressed.
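For a rough feel of what that figure means as a ratio, here is a minimal Python sketch. It assumes exactly 40 VMs (the post says "40+", so this is a lower bound) spread evenly over the 3 hosts; the function name is made up for illustration.

```python
def consolidation_ratio(vm_count: int, host_count: int) -> float:
    """Average number of VMs per physical host."""
    if host_count <= 0:
        raise ValueError("host_count must be positive")
    return vm_count / host_count


# Assumption: exactly 40 VMs, evenly spread (the thread only says "40+").
normal = consolidation_ratio(40, 3)    # all three hosts up
degraded = consolidation_ratio(40, 2)  # one host lost, VMs moved to the other two

print(f"all hosts up:  {normal:.1f}:1")
print(f"one host down: {degraded:.1f}:1")
```

So the average is already above the 10:1 quoted earlier in the thread, and it gets heavier if a host drops out. It says nothing about how busy each VM actually is.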
 
[ QUOTE ]
Turns out our infrastructure people are running 40+ servers on 3 physical machines. I'm quite impressed.

[/ QUOTE ]

If that is the case, I would recommend lobbying for some physical servers. Those are some very busy VM hosts. If you want a test box, put it on VM, but I would suggest physical for the prod servers, unless you don't mind whiny end users.

- Gregg
 
Hi,

It strongly depends on what you run on those VM boxes.

Running an Active Directory controller for, let's say, 100 users is one thing; running an OAS/JAS box for those 100 guys is a very different thing, and even worse would be running a SQL DB server there.
 
I don’t want to get into the ‘to VM or not to VM debate’, it’s a company standard here now, almost everything which is being updated is being virtualised. Oracle advocate it on the cloud computing paradigm, mind you I’m old and cynical enough to recall doing ‘cloud computing’ eleven years ago with Citrix!

Currently I have 80 users at peak time over 2 quad core OAS boxes and it barely touches them. I also have massive application and batch servers; average utilisation is less than 10%. In 2010 and 2011 that will change as we change a bespoke legacy system into JDE and the number of users by the end of 2011 will have changed to 800 with around 600 concurrent at peak times when the US and UK time zones cross over. The business itself has a very low transaction level, in fact I'm astonished at how low, and most of the activity is browsing.

My design has to take into account some form of failover, and personally I prefer an active/active model wherever possible, as in standard clustering models there is a very expensive box sitting there doing nothing. JDE doesn’t do seamless failover (yet). I know it can be emulated, but I’ve set an expectation of around 10 minutes outage plus callout time, which is massively better than the legacy system could do and, given that two years ago an outage of more than an hour on JDE happened at least twice a month, is a considerable improvement. It’s enough for us; it’s not like a production line will stop if the system is down.

So my design (and bear in mind this won’t actually be built for a few months) calls for 10 or fewer WebLogic Servers, virtualised. Since it isn’t even out until March, I’m not going to get any figures for it at the moment. With OAS, experimentation says that I will need around 12-14 servers. (On the whole I’m not best impressed with OAS, but it’s easier to maintain than WebSphere.)

Behind that will be two BSFN Application servers and three Batch servers, one general and the other two feeding two Formscape servers through the OSA by function module.

The SQL server will be real. The deployment server can be virtual. There’s also a development system to take into account; again this will follow the same model: real tin for SQL, virtual for everything else. The whole lot will run Windows 2008 64bit with SQL Server 2008 64bit and run from SAN storage arrays.

So the idea is that I have lots of servers, most of them not doing a huge amount, with enough capacity to cope with peak loads and the failure of a ‘server’. I’ve architected and built systems like this before, so I know it works; the only new bit for me is VM. This year we don’t have the budget for a full-on geographically dispersed HA system; JDE is merely one of a number of client-facing critical systems, none of which have a full HA solution, and it’s outside the scope of this project.
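As a back-of-envelope illustration of how much the consolidation ratio matters for a design like this, the sketch below works out physical host counts for the 12-14 OAS web servers at the 2:1, 5:1 and 10:1 ratios mentioned at the top of the thread. The one spare host (to ride out a host failure) is an assumption added here, not a figure from the thread, and the function name is hypothetical. It ignores RAM, CPU and the other server types entirely.

```python
import math


def hosts_needed(vm_count: int, vms_per_host: int, spare_hosts: int = 1) -> int:
    """Physical hosts required for vm_count VMs at a given VMs-per-host ratio.

    spare_hosts is extra capacity held back for a host failure (assumed: 1).
    """
    if vm_count < 0 or vms_per_host <= 0 or spare_hosts < 0:
        raise ValueError("counts must be non-negative and ratio positive")
    return math.ceil(vm_count / vms_per_host) + spare_hosts


for ratio in (2, 5, 10):    # ratios quoted in the opening post
    for vms in (12, 14):    # OAS server estimate from the design above
        print(f"{vms} VMs at {ratio}:1 -> {hosts_needed(vms, ratio)} hosts")
```

The spread between 2:1 and 10:1 is the difference between roughly seven or eight boxes and three, which is why a rough real-world ratio is useful as a sanity check on vendor sizing.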

The ESX arrays themselves will be dedicated to the JDE infrastructure; in Q3 I am proposing to cut over to the new server system which will liberate 16 servers or so which they can do what they like with.

As an organisation we will be inviting vendors in to size the system according to how many physical boxes we would need to support this and then ‘add some’. So the question I originally posted was to gauge from those who’ve already done it roughly what they got from OAS, so as to perform a sanity check, I don’t need a real accurate result or formulas, just a feel for it…

Thanks all
 
"As an organisation we will be inviting vendors in to size the system according to how many physical boxes we would need to support this and then ‘add some’. So the question I originally posted was to gauge from those who’ve already done it roughly what they got from OAS, so as to perform a sanity check, I don’t need a real accurate result or formulas, just a feel for it…"

Now you've opened the flood gates. The vendor sharks will sniff out the blood you just dangled in the water.


Now that you've elaborated, your plan sounds solid. Just watch out for paging on the virtual OAS / WebLogic boxes and you should be good to go.

One thing to look out for if you use ESX: that solution tends to make some RAM configuration changes on you when you're not looking. I've had some VM boxes set up with several gigs of RAM, then came back six months later and the RAM was cut in half. Apparently that's an automatic setting in VMware (at least that's what the infrastructure guys claim).

You mentioned a virtual deployment server. That's ok, if you don't mind the wait. Our Europe environment virtualized their deployment server and it's slow as a pig. My hardware deployment server can build a full package in 90 minutes. The virtual box takes six hours. That's ok, you just have to plan for the performance hit.

- Gregg
 
[ QUOTE ]
in standard clustering models there is a very expensive box sitting there doing nothing.

[/ QUOTE ]

OT of course, but this comment struck me.

For years I have been designing active/active clusters with SQL on one node and E1 on the other, with "cross failover" where SQL would fail to the E1 node and vice versa. The drawback is that, if one optimizes SQL to use all (or most) available physical memory, there will be memory pressure (and subsequent paging) on the node failed over to, which is now running both SQL and E1. My experience though is that this is preferable to a complete outage. I disable failback and manually move the failed cluster group back to its preferred node at the first reasonable opportunity.

No more wasted node.
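To put a number on the memory pressure trade-off, here is a minimal sketch that estimates how far over physical RAM a node goes once it is hosting both cluster groups. Every figure in it (32 GB nodes, the SQL memory cap, the E1 working set, the OS reserve) is a made-up illustration, not a measurement from any real cluster, and the function name is hypothetical.

```python
def failover_shortfall_gb(node_ram_gb: float,
                          sql_max_memory_gb: float,
                          e1_working_set_gb: float,
                          os_reserve_gb: float = 4.0) -> float:
    """GB of demand beyond physical RAM when one node runs SQL and E1 together.

    A result above 0 means the surviving node would be expected to page.
    """
    demand = sql_max_memory_gb + e1_working_set_gb + os_reserve_gb
    return max(0.0, demand - node_ram_gb)


# Hypothetical: SQL tuned to take most of a 32 GB node.
tuned_for_solo = failover_shortfall_gb(32, sql_max_memory_gb=28, e1_working_set_gb=8)

# Hypothetical: SQL capped lower so both groups fit on one node.
capped = failover_shortfall_gb(32, sql_max_memory_gb=20, e1_working_set_gb=8)

print(f"SQL tuned for its own node: {tuned_for_solo:.0f} GB short after failover")
print(f"SQL capped lower:           {capped:.0f} GB short after failover")
```

The first case is the paging scenario described above; the second trades some day-to-day SQL memory for a cleaner failover.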


If anyone wants to discuss this further, let's not hijack this thread but start another one instead.
 
Yes, have done the same myself.

Only reason I'm not on this one is that I've got a lot of capacity on VM. Still might do it for the BSFNs...
 
Thanks Gregg

Good point - may I stress that I personally have NO SAY in the vendors we invite, so don't bother calling...

I did run a VM server for the DepServer last summer as part of a SQL 2005 upgrade PoC, which didn't happen - long story. I did do a full build on it, but as ever ran it overnight, it was about 60 mins slower than the dev system here which is on older software anyway. Client full build on PD here is just under 5 hours, plus 2 for the servers, so not too concerned unless the hit is similar to yours, that would hurt a bit.

I'm the only CNC here, so build full packages overnight anyway.

Thanks for the memory tip, I've added that to my watch out list.
 
[ QUOTE ]


For years I have been designing active/active clusters with SQL on one node and E1 on the other, with "cross failover" where SQL would fail to the E1 node and vice versa.

[/ QUOTE ]

Hey Jeff,

We have a similar setup, taken one level deeper. We have JDE and the prod databases on one node, and the DEV and PY databases on the other node of the SQL cluster. We stuck enough RAM on both nodes so that we can run everything on one box without a huge performance hit.

Our Asia instance went even more extreme. They have a three-way cluster of batch servers. That one confused the heck out of me the first time I had to do some troubleshooting on it. That configuration is definitely not in the Oracle playbook....

- Gregg
 
Never tried it with OW Services. I know the IP addresses would be different for each cluster group, but I always assumed that the port would clash somewhere along the line, so never bothered to test...
 