Application Performance Management for Virtualization Comes of Age

Since the major push on the part of VMware with vSphere is to virtualize Tier 1 applications, it is important to understand how the Application Performance Management (APM) solutions for virtualized Tier 1 applications are evolving in support of this trend. Before we go into the APM vendors in detail, it is important to note that we define three layers of performance management for virtualized systems. These three layers are Infrastructure Performance Management, Applications Performance Management, and Transaction Performance Management. These layers are defined in the Virtualized Performance and Capacity Management white paper, which is available for download in the Performance Management section of the White Papers page on this site.

VMware shipped AppSpeed in May and put a stake in the ground regarding how applications performance should be managed in virtualized environments. Since then much has changed in the third party APM ecosystem. Before we go into all of the changes from other vendors, let’s review the position that VMware took with the release of AppSpeed:

  1. Application Response Time is critical. It is more critical for understanding the performance of the application than any other metric, especially metrics based upon how much resource the application was using. This is true because the degree of resource sharing that occurs in a virtual environment makes resource utilization (or its inverse) a poor proxy for application performance and end user experience.
  2. Automatic Discovery is essential. AppSpeed (for the applications that it supports) does a great job of discovering (and rediscovering) the tiers of the application system that comprise the application and where they are currently running. This is essential in a virtual environment that is also dynamic since the dynamic nature of the environment causes application components to be moved around.
  3. The virtual network in the VMware host is an important and valid measurement point. Since the AppSpeed collector sits on a virtual mirror port (a promiscuous port) in the VMware host, it can see the interactions between applications running on guests within that host and between that host and other hosts. This is the only approach that provides this level of visibility other than putting an agent in the guest.
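
To make the third point concrete, here is a minimal sketch of how response times could be derived from traffic observed at a mirror port. It is not AppSpeed's implementation. The `Observed` record, its field names, and the pairing rule (each response is matched to the oldest outstanding request on the same connection) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


# Hypothetical record of one message seen on the mirror port. The field
# names are illustrative and do not come from AppSpeed or any other product.
@dataclass(frozen=True)
class Observed:
    ts: float   # capture timestamp in seconds
    src: str    # "ip:port" of the sender
    dst: str    # "ip:port" of the receiver
    kind: str   # "request" or "response"


def response_times(events: List[Observed]) -> List[Tuple[str, str, float]]:
    """Pair each response with the oldest unanswered request on the same
    connection and return (client, server, seconds) tuples."""
    pending: Dict[Tuple[str, str], List[float]] = {}
    results: List[Tuple[str, str, float]] = []
    for ev in sorted(events, key=lambda e: e.ts):
        if ev.kind == "request":
            # Remember when the client sent the request.
            pending.setdefault((ev.src, ev.dst), []).append(ev.ts)
        elif ev.kind == "response":
            # A response travels server -> client, so look up the
            # connection with source and destination swapped.
            queue = pending.get((ev.dst, ev.src))
            if queue:
                started = queue.pop(0)
                results.append((ev.dst, ev.src, ev.ts - started))
    return results
```

Because the timestamps are taken on the wire rather than inside the guest, a measurement of this kind does not depend on how much CPU or memory the guest appears to be using, which is the argument made in the first point.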

Now on to what has changed in the industry. Over the course of the last three months, vendors have made a series of announcements that indicate a mature focus upon the problem of managing applications performance for Tier 1 applications in production. An analysis of what these announcements mean when taken together follows the announcement summaries.

New Relic Adds Support for Java Applications and Introduces On Demand Pricing

New Relic, a vendor with a SAAS delivered monitoring solution for cloud hosted applications, announced RPM Version 2 with support for Java applications (previously New Relic had focused only on Ruby-on-Rails applications) and a new on demand pricing model that mirrors the by-the-click pricing used by many public cloud vendors. RPM Version 2 includes the following key enhancements:

  • The industry’s first cross-platform on-demand application performance management and root-cause diagnosis tools for Rails and Java; includes support for applications developed on WebSphere, WebLogic, JBoss, Tomcat, Jetty and Glassfish and frameworks such as Spring, Grails, and JEE.
  • Scenario-based workflows to facilitate problem solving; Users can go from application overview to code-level diagnosis in three clicks.
  • Optional on-demand pricing as an alternative to fixed monthly or annual fees, enabling pay-as-you-go flexibility.
  • A re-designed intuitive interface that supports monitoring, application troubleshooting and tuning, root cause diagnosis and proactive planning.
  • Long-term SLA reporting to spot performance trends and improve communication of application performance to business managers.

dynaTrace Introduces Virtualization Aware Transaction Centric APM

dynaTrace became the first APM vendor to specifically tune their agent based J2EE monitor for accurate response time measurements in virtual environments. This is a huge step for agent based deep dive J2EE monitors as it involves taking advantage of a new pseudo-performance counter API from VMware to counter the effects of virtualization upon being able to take accurate time based measurements within a VM. dynaTrace is used to monitor some very important tier 1 applications in some very large enterprises, and the ability for their product to work correctly as these applications get virtualized will only accelerate the migration of these tier 1 applications to virtual infrastructures.

OPNET Introduces ACE Live VMon for APM in Virtualized Environments

OPNET, an APM vendor with a combination of a network/HTTP appliance and a J2EE/.Net deep dive agent approach to APM, announced a virtual appliance version of its network appliance. This solution, like AppSpeed, captures HTTP (and other network protocol) transaction response times from the virtual mirror port on the vSwitch. This data is then combined with data collected by an agent in the J2EE or .Net application servers. This is an extremely interesting solution for two reasons. The first is that it combines the network end user experience perspective from either a virtual or physical appliance with deep dive method performance data from a J2EE or .Net agent. The second is that this solution easily spans application systems that are partially physical and partially virtual, a scenario that is likely to be the norm as the first parts of tier 1 applications are virtualized while the remainder of the application stays on physical hardware.

Optier Augments Business Transaction Monitoring with End User Experience Monitoring

In September, Optier, a vendor that has pioneered the Business Transaction Management category, announced the addition of an HTTP monitoring appliance to its CoreFirst transaction monitoring solution. This is a significant announcement from two perspectives. The first is that Optier has done a very nice job of integrating HTTP response time data for web based applications into the same management console that is used by CoreFirst to monitor the performance of individual transactions across the entire applications system. The second is that since CoreFirst is used by some of the very largest enterprises to monitor the most business critical and performance critical applications, this new HTTP appliance puts Optier into a position to easily capture that HTTP performance data (by moving the appliance to a virtual appliance) if and when those applications start to get virtualized.

BlueStripe Announces FactFinder 3.0

BlueStripe announced FactFinder 3.0. FactFinder is based upon an agent that lives in the physical or virtualized OS, maps the application flows, and does hop-by-hop and total response time measurement. FactFinder is unique in that it works across physical and virtual operating systems and the applications that run on those operating systems, and in that it is completely agnostic as to which OS (FactFinder supports Windows and Linux) and which virtualization platform (FactFinder does not care if it is VMware, Hyper-V, Xen or KVM) the application is running on. The key new features of BlueStripe FactFinder 3.0 are:

  • Automatic Application System Identification: In 10 minutes, FactFinder, using AppSmart Technology, automatically maps application transaction systems and highlights discrete systems within the IT infrastructure, naming URL sets, hostnames and app pools for monitoring and fast triage.  System dependency maps measure real-time load curves for proactive capacity planning.
  • Application Transaction System Monitoring: Provides a dashboard of multiple applications, including URL-level visibility and hop-by-hop service request response times at every level across application tiers.
  • Rewind Problem Replay: In addition to FactFinder’s existing real-time & historical problem analysis, FactFinder’s new Rewind capability gives a “black-box flight recorder” for application server platforms, enabling complete replay of application problems for systems not already under monitoring.
  • Enterprise Fleet Scaling: Scales to thousands of servers across physical, virtual and cloud platforms, and drills down into individual application systems of up to 250 servers for complete visibility of response times and component usage across an enterprise.  Federates multiple FactFinder deployments for problem solving collaboration.
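
To illustrate what hop-by-hop and total response time measurement gives an operations team, here is a minimal sketch of turning per-hop response times into a per-tier breakdown. It does not describe how FactFinder works. The function names, the tier names, and the simplifying assumption of a linear chain of synchronous calls (each tier makes exactly one downstream call per request) are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical input: (tier name, response time in seconds as seen by the
# tier's caller), ordered from the front tier to the last tier in the chain.
Chain = List[Tuple[str, float]]


def total_response_time(chain: Chain) -> float:
    """The total is what the caller of the first tier experiences."""
    return chain[0][1] if chain else 0.0


def own_time_per_tier(chain: Chain) -> Dict[str, float]:
    """Time spent in each tier itself: its response time minus the
    response time of the tier it calls (zero for the last tier)."""
    own: Dict[str, float] = {}
    for i, (tier, seconds) in enumerate(chain):
        downstream = chain[i + 1][1] if i + 1 < len(chain) else 0.0
        own[tier] = seconds - downstream
    return own


def slowest_tier(chain: Chain) -> str:
    """Name of the tier that contributes the most time of its own."""
    own = own_time_per_tier(chain)
    return max(own, key=own.get)
```

With a chain such as web, app, and database, the tier with the largest own time is the natural starting point for triage, regardless of whether that tier runs on physical or virtual hardware.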

Summary Analysis

When the first phase of applications got virtualized, the focus was upon making sure that the host servers did not run out of resources and that the applications were available. Vendors like Veeam, Vizioncore, eG Innovations, and Uptime software have provided excellent and cost effective solutions to this problem. The next phase of innovation was to understand the actual performance (from a response time perspective) of the infrastructure, a problem that is currently being addressed by Akorri and Virtual Instruments and which NetQoS clearly has its sights on.

However, in order to virtualize business critical, high transaction rate, and performance critical applications, an understanding of infrastructure performance is not enough. What is critical is for the teams that support these applications in production to have tools that work across OS platforms, across virtualization platforms, and across physical and virtual infrastructures, and which provide layer by layer mapping and hop-by-hop response time. These new tools provide this level of insight into applications performance across virtual and physical infrastructures and in some cases across virtualization platforms. Enterprises looking to make an APM decision for their emerging set of Tier 1 applications are urged to use the following process:

  1. Decide on the number and type of Tier 1 applications that you are going to virtualize over the course of the next 24 months. A small number of homogeneous J2EE applications might lead you to consider dynaTrace or OPNET.  A large number of heterogeneous applications might lead you to look at BlueStripe.
  2. Decide if you want a tool that spans physical and virtual infrastructures. AppSpeed is a fine tool if the entire application except the database servers is virtualized and your applications only use the protocols that AppSpeed recognizes. More complex cases need to be handled by tools that more fully support a physical infrastructure like dynaTrace, OPNET, and BlueStripe.
  3. Decide if you are going to end up with more than one virtualization platform and, if so, whether you want an APM tool that will work across your virtualization platforms no matter which ones you choose.  BlueStripe is the most virtualization platform agnostic tool in the category and is worth looking at in this regard.
  4. Decide if the public cloud is going to be a factor in how you deploy your applications. Most APM tools are not built for easy deployment in public cloud scenarios. New Relic (due to its SAAS model) is the exception in this regard and should be given a strong look for any business critical application (or portion thereof) that will be public cloud based.
  5. Pay attention to Optier. Optier manages end to end business transactions for some of the most important applications in some of the most high profile enterprises. Optier does not currently have a virtualization aware offering, but as their customers start to migrate the applications monitored by Optier to a virtualized infrastructure Optier is certain to add these capabilities. Another way to look at this is that when Optier delivers a vSphere aware product, we will know that VMware has succeeded in getting these most critical applications onto its platform.

For a complete review of the Infrastructure Performance, Applications Performance, and Transactions Performance Management requirements and vendor solutions, please download the white paper below.
