Performance Assurance for Private Clouds

As virtualization matures, great progress is being made towards the goal of allowing performance sensitive applications to run on virtualized platforms. The performance and scalability gains delivered by VMware vSphere are a huge step in this direction. Other good steps in this direction are:

  • vApp from VMware which allows a multi-server application to be encapsulated in one OVF file and managed as an entity.
  • AppSpeed from VMware and other virtualization-aware application performance monitoring solutions from vendors like BlueStripe, dynaTrace, Optier, and OPNET.
  • Maturity in infrastructure performance management solutions from vendors like Akorri and Virtual Instruments.
  • More sophisticated management offerings from VMware like LifeCycle Manager and the forthcoming Config Control product as well as a wealth of third party solutions from vendors like Hyper9, Embotics, Surgient and newScale.
  • The entry of EMC Ionix into the business of managing both physical and virtual infrastructure and performance.

However, there are several challenges associated with migrating Tier 1 applications into a private cloud that are not yet fully addressed. Some of those challenges are in the management and performance domains and consist of the following items:

  • N-tier application systems have to be identified as such and treated as a collection of tiers by the management platform for the virtualized environment. vApp is a great first step in this area but there remains work to be done.
  • One or more of the tiers (the web servers for example) need to be able to scale out to additional virtual machines or even additional physical hardware upon demand.
  • Some parts of the application may not be virtualized right now or may never get virtualized. The management platform for the application needs to be able to deal with applications that are running on a mixture of virtual and physical hardware, and be able to migrate workloads between physical and virtual infrastructure based upon performance policies.
  • The operation of the management platform for these applications needs to be in the hands of the teams that own these applications, which is a different constituency than the VMware administrator.
  • This problem cannot be solved by virtualization specific management consoles like VMware Virtual Center, since vCenter (by way of example) has no knowledge of workloads running on physical servers, and is not an appropriate tool to be used by constituencies like the teams that support these performance critical applications.
  • Not all applications can or should be treated the same. There are clearly applications that are higher priority than others, and there are clearly some applications that have absolute (as opposed to relative) performance requirements.
  • For those applications with absolute performance requirements (for example, application response time for any transaction can never be more than 0.5 seconds), absolute shares of CPU, memory, network bandwidth, and I/O operations need to be reserved so that these requirements can be met.
  • Again for these kinds of applications, the resource reservations need to be managed potentially across virtualization platforms, and across both virtual and physical infrastructures.
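To make the last two points concrete, here is a minimal sketch of what an absolute reservation across a mixed physical and virtual pool might look like. It is purely illustrative and is not how any product mentioned in this article works: the `Host` class, the `place_tier` function, the host names, and the capacity figures are all hypothetical, and only CPU and memory are modeled. The idea is simply that a tier is admitted only if some host can guarantee its full reservation, regardless of whether that host is physical or virtual.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    """A hypothetical capacity record for one physical or virtual host."""
    name: str
    kind: str              # "physical" or "virtual" (illustrative labels)
    cpu_mhz: int           # total CPU capacity
    memory_mb: int         # total memory capacity
    reserved_cpu_mhz: int = 0
    reserved_memory_mb: int = 0

    def can_reserve(self, cpu_mhz: int, memory_mb: int) -> bool:
        # An absolute reservation must fit entirely within what is left.
        return (self.reserved_cpu_mhz + cpu_mhz <= self.cpu_mhz
                and self.reserved_memory_mb + memory_mb <= self.memory_mb)

    def reserve(self, cpu_mhz: int, memory_mb: int) -> None:
        if not self.can_reserve(cpu_mhz, memory_mb):
            raise ValueError(f"{self.name} cannot guarantee this reservation")
        self.reserved_cpu_mhz += cpu_mhz
        self.reserved_memory_mb += memory_mb


def place_tier(hosts: List[Host], cpu_mhz: int, memory_mb: int) -> Optional[Host]:
    """First-fit placement: reserve on the first host with enough headroom.

    Returns the chosen host, or None if no host in the pool can guarantee
    the reservation (the request is rejected rather than overcommitted).
    """
    for host in hosts:
        if host.can_reserve(cpu_mhz, memory_mb):
            host.reserve(cpu_mhz, memory_mb)
            return host
    return None


if __name__ == "__main__":
    # Hypothetical pool: one virtualization host and one physical server.
    pool = [
        Host("esx-01", "virtual", cpu_mhz=4000, memory_mb=8192),
        Host("db-01", "physical", cpu_mhz=8000, memory_mb=32768),
    ]
    for cpu, mem in [(3000, 4096), (2000, 2048), (7000, 1024)]:
        chosen = place_tier(pool, cpu, mem)
        print(cpu, mem, chosen.name if chosen else "rejected")
```

First-fit is used here only because it is the simplest policy to read; a real workload manager would also weigh priorities, network and I/O reservations, and the cost of migrating workloads between platforms.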

Just as nature abhors a vacuum, venture funded startups abhor unsolved problems that have money associated with solving them, so there is some good progress being made on this front by a couple of vendors. Quite specifically, this is not just a performance management problem that can be dealt with via monitoring, nor is it just a virtualization management problem that can be dealt with via a management product designed for applications support teams. Solving this problem requires some leading edge features from both disciplines, with the added complexity of having to deal with both virtual and physical resources for these kinds of workloads.

The first vendor that is taking a crack at this problem is Platform Computing. Platform Computing is a significant company that has been around for 17 years (not a startup) with a market leadership position in workload management for high performance grids (if you are going to simulate a rocket launch on a grid of 2000 computers and the simulation has to be complete in X seconds you would use Platform Computing’s HPC products for this task). Platform therefore has a great deal of technical and customer experience in the area of ensuring job completion time for these types of HPC workloads.


Platform has just announced the delivery of Platform ISF which is designed to “create a shared computing infrastructure from physical and virtual heterogeneous resources to deliver broad application environments with efficient workload-smart and resource-aware policy capabilities”. Platform ISF is a management solution in the sense that it gives applications teams the ability to do self service provisioning based upon policies. ISF is also a workload management product in the sense that it is able to guarantee a combination of virtual and physical resources that are needed to ensure that a given application performs as expected. ISF does not currently monitor the performance of the applications from a response time perspective in order to grab additional resources needed to assure adequate performance, but this is clearly a direction that this product will evolve towards. For more information about Platform ISF and to download a free evaluation version, visit the Platform ISF product page on the Platform Computing web site.

The next vendor that is focused upon this problem is Fortisphere. Fortisphere is led by Siki Giunta, who was previously the CEO of Managed Objects, a vendor that was a leader in the Business Service Management space. Fortisphere has announced the availability of its Virtual Service Manager (VSM), which combines Fortisphere’s historical strengths in Inventory, Configuration, and Role-based Management with a new focus upon creating Virtual Service Tiers with guaranteed levels of performance. Like Platform ISF, Fortisphere currently uses resource utilization as a proxy for performance, but it is reasonable to expect the company to either develop or partner to get a response time capability into its solution.


Summary Analysis

Performance assurance for n-tier applications running across dynamic physical and virtual infrastructures is a demanding and unsolved problem. However, it is a problem that must be solved in order for the most business critical of applications to move from a dedicated, static, and siloed physical infrastructure to a dynamic infrastructure that places workloads on the correct mix of physical and logical compute, network and storage resources. Attacking this problem requires assets from the virtualization management realm, the performance management realm and the workload management realm. At present, no one has all of these bases completely covered. However, both Platform and Fortisphere have made enough progress that their solutions are worthy of evaluation by enterprise customers that wish to achieve the CAPEX and OPEX savings associated with placing their most important and demanding applications on a dynamic infrastructure.
