Wednesday, July 20, 2011

Scale Up with VMware vSphere 5: “I’m Not Dead Yet!”

Well, what a crazy two weeks. It looks like Kevin Bacon became a VMware employee this week:

Since this post will be long and filled with numbers, here’s a summary:

The cost difference to license VMware vSphere 5 in a Scale Out vs. Scale Up scenario is roughly equivalent, BUT if you take into account the additional hardware and Microsoft license costs required, Scale Up holds its own against Scale Out. Remember, we have to build a TOTAL solution, not just look at one small piece of the overall cost.

Now that the panic has died down a bit, I wanted to show some very surprising numbers that I found last week regarding Scale Up vs. Scale Out with vSphere 5. Many people’s first reaction to high memory environments was “I’m just going to buy 96GB servers from now on!” I’ve run the numbers and the short answer is you shouldn’t do that if your main concern is TCA (Total Cost of Acquisition).

For those unfamiliar with the terms, what are Scale Out and Scale Up? Scale Out refers to the concept of hosting many small servers in a cluster to provide resources. For this article, I will define a 2xCPU server with 96GB of memory as the baseline for Scale Out. In Scale Out, you have a higher number of smaller servers. The main advantage to Scale Out is a higher utilization per node if you plan for an outage in the environment. Today we typically use n-1 or n-2 for planning purposes. This means we set aside the capacity equal to one or two hosts spread across the cluster. Scale Out also allows you to start small and grow large over time.

Scale Up is the concept of using a smaller number of servers with a larger capacity to achieve that same cluster size of resources. In this instance, you have a smaller number of larger sized servers.  For this article I classify a Scale Up server as anything greater than 2xCPU or 96GB of memory. The advantage to Scale Up is fewer management points and typically a smaller cost per unit. In a Scale Up scenario, you invest up front and see greater savings over time.

I know many customers have different theories about either model, and no one model fits every customer. If you take into account budget cycles, internal politics, facilities, existing standards, etc., you may not be able to make a decision based solely on the numbers presented here. I also concede that some just aren’t comfortable with Scale Up and the potential virtual machine density it can provide. In Scale Up’s defense, I contend that with the advances to vMotion and HA in both vSphere 4.1 and 5, we will see this fear start to ease over time as everyone becomes more comfortable with the enhancements to the technologies.

If VMware license costs aren’t the Scale Out culprit, who is?

Everyone get your pitchforks; let’s figure out who we need to string up next! As I stated previously, it is now cost neutral for Scale Out and Scale Up with vSphere. But, what about Microsoft licenses? What about hardware costs for new servers? What about soft facility costs like power, cooling and space? If you are going to build a solution you need to factor in ALL the costs, not just a single data point.

Let’s take each item one by one.
  • Microsoft License Costs – Uncle Bill Gates (or should I say Uncle Steve Ballmer these days) gets paid one way or the other. In larger environments, the most cost effective way today to license a VMware server is to purchase a Data Center License for each host. A Microsoft Data Center License costs $2999 MSRP per socket. This leads to a minimum $6000 “mTax” per 2xCPU server!
  • Physical Server Costs – There are many things to consider here. We need to buy a server, at least 96GB of memory, 10Gb NICs or a bunch of 1Gb NICs, and we can’t forget about hardware maintenance for 3-5 years. I went online to the HP website and configured a DL360 with 2xCPU, 96GB, and 2x10Gb NICs for about $11,000 MSRP. I then priced the same server with 192GB at about $24,000 MSRP. I’m going to use these two numbers as my data points. Let’s call this value the “pTax”.
  • Soft Costs – This includes many things that are hard to quantify but NEED to be considered in calculations. It costs money to own and operate a server. Servers take power, they take cooling, they need people to manage them and we have to pay this staff (at least the good ones), they consume network ports, we have to plug cables into them, etc. Everything listed contributes to the cost of the solution. Let’s call this value the “sTax”. This isn’t a fantasy world of free servers and facilities. No Rainbows and Unicorns for You!

Still with me? Eyes haven’t glazed over yet? In for the long haul? Ready for some numbers? Read On.

I created a spreadsheet modeled after the awesome Rynardt Spies’s Blog Post:

For this exercise, I’m assuming only 2xCPU servers per the pTax bullet above and comparing 96GB and 192GB configurations. I’m also assuming vSphere Enterprise Plus and MSFT Data Center Licenses for each server. This is a worst-case scenario: a scenario where the entire infrastructure is purchased up front. In environments of this size, it is often considered bad practice to “go back to the well” for more funding later. I have consulted with many customers in which this has been the case over the years.

Let’s look at a small four-host Scale Up cluster with 192GB per server vs. an equivalent number of 96GB servers. To calculate the number of equivalent 96GB Scale Out servers, we need to compare the Configured Memory value (highlighted below). Using an n-1 failure scenario, take the total cluster memory on the 192GB line (768GB) and subtract the memory set aside for one host failure, which leaves 3 hosts * 192GB = 576GB. You can achieve an equivalent amount of configured memory with seven Scale Out 96GB hosts, since six of them provide 6 * 96GB = 576GB and the seventh covers the same n-1 reservation.
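The host math above can be sketched in a few lines of Python. This is only an illustration of the n-1 sizing described here, not the spreadsheet itself; the function names `usable_memory_gb` and `equivalent_hosts` are made up for this example.

```python
import math

def usable_memory_gb(hosts: int, gb_per_host: int, spare_hosts: int = 1) -> int:
    """Cluster memory left after setting aside `spare_hosts` worth of capacity."""
    return (hosts - spare_hosts) * gb_per_host

def equivalent_hosts(target_gb: int, gb_per_host: int, spare_hosts: int = 1) -> int:
    """Smallest host count whose usable memory still covers `target_gb`."""
    return math.ceil(target_gb / gb_per_host) + spare_hosts

# Four Scale Up hosts at 192GB, planning for one host failure (n-1).
scale_up_usable = usable_memory_gb(hosts=4, gb_per_host=192)
# How many 96GB Scale Out hosts give the same usable memory under n-1?
scale_out_hosts = equivalent_hosts(scale_up_usable, gb_per_host=96)

print(scale_up_usable, scale_out_hosts)  # 576 7
```

Passing `spare_hosts=2` gives the n-2 variant used for larger clusters.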

Now that we know the number of hosts for Scale Out and Scale Up, we can calculate the total cost. In this example three subcomponents make up the total cost: vSphere cost, MS DC cost, and hardware cost. vSphere cost is equal to the number of licenses * $3495, MS DC cost is equal to the number of licenses * $2999, and hardware cost is equal to the number of hosts * either $24,000 for Scale Up or $11,000 for Scale Out.
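Here is a rough sketch of that cost formula for the four-host example. The prices come from this post, but the vSphere license count is my assumption: I take it as the larger of the socket count and the configured memory divided by a 48GB vRAM entitlement per Enterprise Plus license, a figure this post does not state. It does reproduce the 12 licenses quoted in the comments for the Scale Up case. The function name `total_cost` is made up, and the totals may not match the spreadsheet exactly.

```python
import math

VSPHERE_PER_LICENSE = 3495   # vSphere list price per license, from the post
MS_DC_PER_SOCKET = 2999      # Microsoft Data Center MSRP per socket, from the post
VRAM_GB_PER_LICENSE = 48     # ASSUMPTION: vRAM entitlement per license, not in the post

def total_cost(hosts, sockets_per_host, configured_gb, hardware_per_host):
    """Break the acquisition cost into the three subcomponents described above."""
    sockets = hosts * sockets_per_host
    # Assumed rule: at least one license per socket, more if vRAM demands it.
    vsphere_licenses = max(sockets, math.ceil(configured_gb / VRAM_GB_PER_LICENSE))
    return {
        "vsphere": vsphere_licenses * VSPHERE_PER_LICENSE,
        "ms_dc": sockets * MS_DC_PER_SOCKET,
        "hardware": hosts * hardware_per_host,
    }

# Both clusters deliver 576GB of configured memory under n-1.
scale_up = total_cost(hosts=4, sockets_per_host=2, configured_gb=576, hardware_per_host=24_000)
scale_out = total_cost(hosts=7, sockets_per_host=2, configured_gb=576, hardware_per_host=11_000)

print("Scale Up ", scale_up, sum(scale_up.values()))     # total 161932
print("Scale Out", scale_out, sum(scale_out.values()))   # total 167916
```

Under these assumptions the vSphere line items are close ($41,940 vs. $48,930); most of the gap comes from the Microsoft licenses, partly offset by the cheaper Scale Out hardware.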

As you can see in this example, Scale Out is marginally more expensive (I’d argue that the cost is close enough to be “in the noise” and the costs are roughly equal). Scale Up is a viable solution.

What happens when we double the cluster size to 8 Scale Up hosts? Using the same calculations we see Scale Up start to pull ahead.

Let’s go big! In this example I doubled again to 16 Scale Up hosts. In this example I used n-2 for the host failure scenario due to the increased cluster size. The greater we scale, the greater the benefits.
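To see how the gap moves with cluster size, the three scenarios can be run side by side. This is a sketch under the same inputs as the rest of the post (2xCPU hosts, $3495 per vSphere license, $2999 per Microsoft socket, $24,000 and $11,000 hardware), plus one assumption that the post does not state: vSphere licenses are counted as the larger of the socket count and configured memory divided by 48GB of vRAM per license. The helper names are hypothetical and the results are illustrative, not a copy of the spreadsheet.

```python
import math

VSPHERE, MS_DC = 3495, 2999        # per-license and per-socket prices from the post
HW = {192: 24_000, 96: 11_000}     # hardware MSRP per host from the post
VRAM_GB = 48                       # ASSUMPTION: vRAM per license, not in the post
SOCKETS = 2                        # every host is 2xCPU

def cluster_cost(hosts, gb_per_host, usable_gb):
    """vSphere + Microsoft + hardware cost for one cluster."""
    sockets = hosts * SOCKETS
    licenses = max(sockets, math.ceil(usable_gb / VRAM_GB))
    return licenses * VSPHERE + sockets * MS_DC + hosts * HW[gb_per_host]

def compare(scale_up_hosts, spare):
    """Return (Scale Out host count, Scale Up cost, Scale Out cost)."""
    usable = (scale_up_hosts - spare) * 192
    scale_out_hosts = math.ceil(usable / 96) + spare
    up = cluster_cost(scale_up_hosts, 192, usable)
    out = cluster_cost(scale_out_hosts, 96, usable)
    return scale_out_hosts, up, out

for hosts, spare in [(4, 1), (8, 1), (16, 2)]:
    out_hosts, up, out = compare(hosts, spare)
    print(f"{hosts} x 192GB (n-{spare}) vs {out_hosts} x 96GB: "
          f"Scale Up ${up:,}  Scale Out ${out:,}  gap ${out - up:,}")
```

With these inputs the Scale Out premium works out to $5,984 at 4 hosts, $21,976 at 8 hosts, and $43,952 at 16 hosts.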

In conclusion, Scale Up as a general rule isn’t more expensive. As memory costs fall, this trend will accelerate over time. The TCA (Total Cost of Acquisition) for Scale Up is often equal to or less than that of an equivalent amount of Scale Out computing. Scale Up can provide a beneficial cost structure while providing fewer management points in your environment. Remember, if you are going to include the “vTax”, you’ll need to include the “mTax”, “pTax” and “sTax” as well!

Scale Up Computing’s Not Dead Yet!

I want to thank Maish, Andrew Storrs & vTexan as well as some of my VCE & VMware peeps for helping me out with the numbers and providing clarity. This post was a train wreck to start and would not have been possible without them!!!


wuffers said...

How do the new vRAM entitlements affect you?

Please take 2 minutes of your time to fill out this vSphere 5 migration survey:

We need more data! Results will be posted in the main vSphere 5 licensing thread over at VMTN:

First round of results here:

Anonymous said...

I don't see how your calculations have any relevance to the vTAX discussion.

The only thing you have proven is that VMware have eliminated the benefit of Scaling UP.

Virtualization equals Scaling Up. That means that VMware is about to ruin the benefits of virtualization.

In your calculations you used 12, 28 and 56 vSphere licenses, whereas with vSphere 4 licensing you only needed 8, 16 and 32 licenses. That means that VMware took:
12x$3495=$41940 and

That's a lot of money VMware have pulled out of the business case, and there is no doubt in my mind the customers will respond negatively. VMware is constantly trying to move the focus and convince everyone that this is fair, but to treat paying customers who have been buying software subscription in order to upgrade like this is not fair. I am starting to suspect that you have been headhunting people from M$ marketing and licensing department.

ChrisR said...

I believe that if you scale out long enough this can be cheaper than scaling up. Why? Because if you have, let's say, 100 ESX hosts, you can drop a lot of redundancy in each host. If one host fails... big deal, there are plenty of them. So no more redundant power supplies, redundant boot disks, expensive 24x7 hardware support, etc. In other words: you should design to fail. And if you do, you can get away with way cheaper hardware and support contracts. If you just have a few scaled up hosts you don't want to risk losing a host for a day or 2, so you need more expensive hardware, not just more RAM banks!

Aaron Delp said...

@Anonymous - If you are referring to vSphere 4 to vSphere 5, then yes, that isn't covered here. It's been done to death and it wasn't what I was interested in nor the focus of this article.

Since my company's blade solutions START at 96GB, I needed to know the impact of Scale Up vs Scale Out so I can speak to my customers.

You may have frustrations around the upgrade process and I get that but this isn't the forum for that.

You'll note that I DON'T work for VMware and I reserve the right to publish or delete comments on my PERSONAL blog as I see fit.

Everyone please consider that in future comments.

Aaron Delp said...

Hey ChrisR - I would have to see some numbers but I suspect that the Microsoft Licenses will kill you. If you were running Open Source, I agree that might work because you could use a much cheaper server and incur minimal license costs.

I scaled numbers up to 64 hosts and servers with up to 228GB RAM and my calculations all reflect the same thing. I didn't do 384GB but it would probably kill the Scale Out model using this scenario.

Thanks for your comment!

Paul Lembo said...


Thanks for treating this in a level headed way. I wish more people covering technology didn't treat business decisions like a referendum on if their wife was hot or not.


Craig said...

Scale up and scale out need to be done in parallel; scaling out at the host level alone doesn't solve the application's performance and redundancy requirements. You can verify this with any application or ERP architect; they can share a lot of experience. For the TCO of running the entire infrastructure, you need to account for the 10 switches, cabling, fabric switches, management effort and also power cost. The so-called scale out of today, with 96GB of memory in a 2-way server, is no longer justified, as you can have 10 or 12 cores per socket in a 2-way server nowadays. Eventually 6 cores will be replaced by a newer chipset. If you run your own DC, please check the cost per U of space and also the power cost to support all the devices you have in the data center; a huge energy cost will go into the bill and make the cloud more expensive. I will not say scale up is the only way to go, but you will always need to find the balance between CPU and memory to size for the most common scenario.

Anonymous said...

What I take away from this is that VMware suddenly have a lot bigger piece of the total cost than before... and compared to all their competitors. It's interesting to see that Microsoft licensing benefits properly to scale out while there is virtually no benefit on VMware licensing from scaling out. Your whole cost saving with scaling out is from Microsoft licensing! Might as well use those same licenses to run Hyper-V.