VMware’s vSphere team has done it again. The most important and best systems software company on the planet is announcing a major upgrade to its platform that once again raises its game into a different orbit than the pretenders and the contenders.
What Has VMware Announced?
There are five major new VMware announcements:
- A new version of vSphere (version 5.0), the first version that does not support the ESX hypervisor, and that only supports the ESXi hypervisor
- Site Recovery Manager (SRM) 5.0 – dramatically improved with replication and fail-back
- vShield 5.0 – An important advance in anti-virus and malware protection for vSphere
- vCloud Director 1.5 – a point release with mostly minor but still important improvements
- vSphere Storage Appliance 1.0 – a virtual appliance that takes the hard disk storage local to up to three servers and makes that storage into a virtual SAN. This is an important SMB play as it dramatically reduces the cost of deploying VMware into small (three or fewer) host environments.
We will be covering each of the products in individual posts as details become clear. For the balance of this post, we will focus upon the platform and its suitability for business critical and performance critical applications.
The vSphere story a year ago was that most enterprises with performance and business critical applications were stuck at around 30% virtualization with the “low hanging fruit” having been virtualized, and the tough stuff (the business and performance critical applications) remaining on physical hardware.
VMware has reported some good incremental progress on this front based upon the success of the vSphere 4 platform rolled out this year. The progress is depicted in the graph below.
With vSphere 5, VMware has significantly raised the bar in terms of the capacity and performance of the underlying platform. The table below shows the growth in the virtual resources that can now be allocated to virtual machines in vSphere environments.
It is clear from the above table that the capacity now exists to support almost any workload in a vSphere environment. Now the question turns to the ability to manage that capacity for resources and I/O rate on behalf of the most important applications.
A Resource Based Approach to Service Level Agreements
It is in the approach to management that VMware faces its largest challenge. If we take the platform capability numbers above at face value, then it is clear that the platform should be well capable of supporting most of the x86-based workloads that are not virtualized yet. However, while those of us who are fans of virtualization, VMware and vSphere may believe it to be obvious that a platform with the above capabilities is suitable to host business critical and performance critical applications, the people who own those applications are not so easily convinced.
It is here that VMware is making good progress but still has an enormous amount of work to do. Along with vSphere 5, VMware has announced the concept of “performance guarantees”. This is a clever bit of marketing phraseology, since what is being guaranteed is not the performance of the application (its response time), but rather the resources dedicated to it. What VMware claims to have solved is the “noisy neighbor” problem, where a VM that is not important consumes resources better allocated to an important (business critical or performance critical) workload.
While it is incredibly useful to be able to assure storage and network bandwidth (as well as CPU and memory) to a particular workload, doing so does not “Guarantee Applications Performance”. The reason for this is that in the eyes of the applications owner “Applications Performance” is the response time of the application as delivered by the edge of the application system to the network that delivers the experience to the end user.
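The gap between a resource guarantee and a performance guarantee can be made concrete with a small sketch. Everything in it is hypothetical: the `WorkloadSample` fields, the 2000 MHz reservation, and the 500 ms response time target are invented for illustration and do not come from any VMware API.

```python
from dataclasses import dataclass


@dataclass
class WorkloadSample:
    """One hypothetical observation of a workload (all fields are assumptions)."""
    cpu_mhz_reserved: int     # resources promised to the VM
    cpu_mhz_delivered: int    # resources the VM actually received
    response_time_ms: float   # what the application owner cares about


def resources_guaranteed(sample: WorkloadSample) -> bool:
    # The "performance guarantee" as described above: resources were delivered.
    return sample.cpu_mhz_delivered >= sample.cpu_mhz_reserved


def response_time_met(sample: WorkloadSample, target_ms: float) -> bool:
    # The application owner's definition of performance: response time.
    return sample.response_time_ms <= target_ms


# The VM got every MHz it was promised, yet the application is still slow
# (for example because of a slow database call or a code-level problem).
sample = WorkloadSample(cpu_mhz_reserved=2000, cpu_mhz_delivered=2000,
                        response_time_ms=850.0)
resource_ok = resources_guaranteed(sample)
app_ok = response_time_met(sample, target_ms=500.0)
print(f"resources guaranteed: {resource_ok}, response time met: {app_ok}")
```

In this made-up case the resource check passes while the response time check fails, which is exactly the situation an application owner would object to.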
The Applications Performance Assurance Hole
With vSphere 5, it is highly likely that VMware has delivered a platform to the market that is capable of hosting business and performance critical applications, but has fallen short in terms of delivering the performance management tools necessary in order to be able to measure applications performance in a manner relevant to the owners of these applications (who have the political power in many organizations to block their virtualization).
This brings up a critical point. While the vSphere 5 platform may be well capable of virtualizing these applications, they are only going to get virtualized if a response time profile for the application is created before the application is virtualized, and if ongoing response time management of the application occurs once the application is virtualized. What this means in practice is that the IT team that owns the virtualization platform is going to have to assume responsibility for the response time profiles of the business and performance critical applications running on the virtualization platform.
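As a rough illustration of what a response time profile might look like in practice, the sketch below builds a simple percentile baseline from measurements taken on physical hardware and compares it with measurements taken after virtualization. The sample values, the choice of p50 and p95, and the 10% tolerance are all assumptions made for the example; a real APM product would collect far more data and likely use different statistics.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of response times (ms)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def build_profile(samples_ms):
    # A deliberately tiny "profile": median and 95th percentile response time.
    return {"p50": percentile(samples_ms, 50), "p95": percentile(samples_ms, 95)}


def regressions(baseline, current, tolerance=0.10):
    # Flag any percentile that is more than `tolerance` worse than the baseline.
    return [k for k in baseline if current[k] > baseline[k] * (1 + tolerance)]


# Hypothetical measurements, in milliseconds.
physical_ms = [120, 130, 125, 140, 135, 128, 122, 131, 138, 300]
virtual_ms = [125, 133, 129, 150, 139, 131, 127, 136, 144, 420]

baseline = build_profile(physical_ms)   # taken before virtualization
current = build_profile(virtual_ms)     # ongoing measurement afterwards
flagged = regressions(baseline, current)
print("baseline:", baseline, "current:", current, "regressed:", flagged)
```

With these invented numbers the median stays within tolerance but the 95th percentile does not, which is the kind of evidence the team owning the virtualization platform would need to be able to produce.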
The lack of a strong APM offering from VMware therefore creates an opening for first class solutions from the third party VMware ecosystem that can in fact provide this information. The table below compares the leading third party solutions to this problem. The criteria used are as follows:
- Deployment Method refers to whether or not this is a product that is installed on premise, or whether it is offered as a hosted service by the vendor. It also refers to how the data is collected (either with an agent in the application, an agent in the OS, or monitoring the network via a virtual or physical mirror port).
- Supported App Types refers to the breadth of applications supported by the solution
- Application Topology Discovery refers to the ability of the solution to find what is talking to what, which leads to an ability to discover the application system in its entirety
- Cloud Ready refers to the ability for the collection agent or appliance to live on a different network entirely from the management system and “phone home” into a collector in the DMZ.
- Zero Config refers to the ability of the solution to work out of the box and to discover, up front and then continuously, what it needs to discover in order to work.
- Code Level Diagnostics refers to the ability to find the line of code at fault for the performance and availability problem. This is a feature designed for developers of custom applications who are supporting those applications in production. This is traded off against the breadth of supported application types in the second bullet above (no product provides code level diagnostics for every application, and APM solutions that support every application do not offer code level diagnostics).
|Developer Focused APM Solutions|
|AppDynamics|On Premise – Agent in|
| |On Premise – Agent in|
| |Monitoring as a Service – only agents installed in application run time, back end hosted by New Relic|
| |On Premise – Agent in application run time & appliance on switch ports|
|IT Operations Focused APM Solutions|
|AppFirst|Monitoring as a Service – agents installed in guest operating systems. Back end hosted by AppFirst.|All applications on Windows or Linux|
| |On Premise – Agent in guest operating system|All TCP/IP on Windows or Linux|
|ExtraHop|On Premise – Virtual and/or physical appliance on switch|All TCP/IP applications|
There is an important distinction in the table above. That distinction is the target use case of the solution. AppDynamics, dynaTrace, New Relic and the Quest Foglight solution are primarily focused upon helping the team that developed a custom application support that application in production. Therefore the principal value of these products is in finding the problem in the code of the custom developed application that needs to be fixed by a developer.
AppFirst, BlueStripe, and ExtraHop are targeted at an entirely different use case. That use case is the support of any and all applications no matter how procured (developed or purchased) or, if developed, in what manner (language, application run time, etc.). These products offer breadth of applications support (relatively speaking, they support all applications) and provide value through end-to-end and hop-by-hop response time information, along with diagnostics that are more oriented to where in the infrastructure (and not the code) the problem lies.
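To show what hop-by-hop response time information buys an IT operations team, here is a minimal sketch that takes per-hop latencies for one transaction and reports which tier contributes most to the end-to-end time. The tier names and timings are invented, and this is not how any of the named products work internally; it only illustrates the idea of locating the problem in the infrastructure rather than in the code.

```python
def slowest_hop(hops):
    """Return the hop with the largest latency and its share of the total.

    `hops` is a list of (tier_name, latency_ms) pairs for one transaction.
    """
    total = sum(ms for _, ms in hops)
    name, ms = max(hops, key=lambda hop: hop[1])
    return name, ms / total


# Hypothetical breakdown of a single 400 ms transaction.
hops = [
    ("web tier", 40.0),
    ("app tier", 60.0),
    ("database", 250.0),
    ("storage", 50.0),
]

name, share = slowest_hop(hops)
print(f"end-to-end: {sum(ms for _, ms in hops):.0f} ms, "
      f"slowest hop: {name} ({share:.1%} of total)")
```

The end-to-end number alone says only that the transaction is slow; the per-hop view points at the tier where investigation should start.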
Just in time for the adoption of vSphere 5 by enterprises seeking to virtualize business critical and performance critical applications, AppFirst, BlueStripe, and ExtraHop have pioneered a new category of APM solutions. This new category is focused upon allowing IT to take responsibility for applications response time for every application running in production. This is an essential step on the road toward virtualizing the 60% of the applications that remain on physical hardware.