vSphere 4.1 Improvements in Availability

The release of vSphere 4.1 brings some great enhancements.  In one of my earlier posts I took a look at the vSphere 4.1 release of ESXi.  In this post I am going to take a look at the vSphere 4.1 availability options and enhancements. So what has changed with this release?  A maximum of 320 virtual machines per host has been firmly set.  In vSphere 4.0 the VM-per-host limits for DRS differed from the rules for VMware HA. VMware has also raised the number of virtual machines that can run in a single cluster from 1280 in 4.0 to 3000 in vSphere 4.1. How do these improvements affect your upgrade planning?

With this information, let's look at some specific VMware HA limitations. When I say limitation, it is important to understand that the HA limits are POST-FAILOVER limits.  What I mean is that you have to make sure you will be running no more than 320 virtual machines per host after any failover has happened.  In other words, if you have a five-node cluster that is configured to tolerate a single host failure, then the total number of virtual machines you can run on that cluster is 1280.  I got to that number by taking the size of the cluster, subtracting the number of host failures you want to tolerate, and multiplying the remaining hosts by the per-host maximum: 320 virtual machines per host on four hosts is 320 x 4 = 1280.  In the same scenario, tolerating a two-host failure gives you a maximum of 320 virtual machines on each of three physical hosts: 320 x 3 = 960. Keep that in mind when designing your infrastructure.
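
The arithmetic above is easy to script when you are sizing several clusters. Here is a minimal sketch in Python, not anything VMware ships: the function name max_cluster_vms and the constant names are my own, the two limits are the 4.1 numbers quoted in this post, and capping the result at the 3000 per-cluster maximum is my assumption about how the two limits combine.

```python
# Hypothetical sizing helper; names are illustrative, not a VMware API.
MAX_VMS_PER_HOST = 320       # vSphere 4.1 per-host limit quoted in this post
MAX_VMS_PER_CLUSTER = 3000   # vSphere 4.1 per-cluster limit quoted in this post


def max_cluster_vms(hosts: int, host_failures_tolerated: int) -> int:
    """Return the most VMs a cluster can run while staying within the
    per-host limit after the tolerated number of host failures."""
    if hosts < 1:
        raise ValueError("a cluster needs at least one host")
    if not 0 <= host_failures_tolerated < hosts:
        raise ValueError("tolerated failures must be between 0 and hosts - 1")

    # Only the hosts that survive the failover count toward capacity.
    surviving_hosts = hosts - host_failures_tolerated

    # Assumption: the per-cluster maximum also applies, so take the smaller value.
    return min(surviving_hosts * MAX_VMS_PER_HOST, MAX_VMS_PER_CLUSTER)


if __name__ == "__main__":
    print(max_cluster_vms(5, 1))  # five hosts, one failure tolerated
    print(max_cluster_vms(5, 2))  # five hosts, two failures tolerated
```

With five hosts this reproduces the 1280 and 960 figures from the paragraph above. Always check the result against the current configuration maximums document before committing to a design.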

In the vCenter client you will find that the HA dashboard has been updated with a new detailed window called Cluster Operational Status.  This window displays more information about the current state of HA in the cluster, including its operational status and the specific status and errors for each host in the cluster.

VMware has also made DRS and HA play better together by improving the interoperability between the two services and integrating them more tightly.  The interaction comes into play when DRS performs vMotions to free up contended resources on one host so that HA can place the virtual machines that need to be restarted. VMware has also added more APIs for HA application awareness for 3rd party developers to work with.  I imagine we will be seeing applications taking advantage of this in the very near future.

I still consider VMware Fault Tolerance to be in its infancy, but improvements are being made. With the release of vSphere 4.1, Fault Tolerance virtual machines can take advantage of DRS for initial placement and for load balancing.  One thing to note: EVC is now required to be enabled. This requirement helps DRS when it determines where it can place Fault Tolerance virtual machines. vSphere 4.1 also introduces a Fault Tolerance specific version-control mechanism that allows the primary and secondary virtual machines to run on hosts with different but compatible patch levels.

All in all, the vSphere 4.1 release is a great enhancement to the vSphere product. It shows continued improvement and keeps setting vSphere apart from the other hypervisors on the market. Just remember the limits on the number of virtual machines and the EVC requirement during your planning stages.