In “VMware Articulates a Compelling Management Vision – Automated Service Assurance”, we gave credit to VMware for (finally) articulating why their management stack was going to be different from, and better than, every management stack that we have known and hated in the physical world. That difference was that VMware was going to focus upon automated remediation of problems instead of “monitor, alert, and then fix manually”.
We noted that while it was VMware’s objective to guarantee the performance of applications, the capabilities that existed or had been announced at that time focused upon guaranteeing resources to key applications and simply eliminating the “noisy neighbor” problem. We made the point that if you want to guarantee the performance of an application, the first thing you have to do is measure the performance of the application, and that application performance means response time, not how much CPU, memory, network, or storage I/O the application is using.
This set the stage for the integration of Application Performance Management (APM) into the question of how to automatically assure the performance of key applications running on dynamic infrastructures. In “Why is Application Performance Management so Screwed Up” we detailed what is wrong with first generation (legacy) APM solutions, and why they cannot be used to assure the performance of applications on a dynamic infrastructure. The short answer is that legacy APM solutions from vendors like IBM, CA, BMC, and HP are neither a technical fit (they do not have the required features) nor an economic fit (they are priced and packaged incorrectly, and they require way too much manual configuration and consulting to make work).
So Who is Reinventing APM?
If legacy solutions do not have the right features, are too expensive to buy, too expensive to make work, and cannot keep up with dynamic environments, then who is leading the charge to address these new requirements? The good news is that a set of very competent and well financed companies have taken on this challenge, and they are now joined by VMware.
New Relic. New Relic is a classic case of someone succeeding by doing something one way, then realizing that there is a better way, and succeeding a second time by doing the opposite. New Relic was founded and is run by Lew Cirne, who also founded Wily, which was acquired by CA in 2006. CA Introscope was and still is the market leading first generation APM solution, characterized by high prices, long implementation times with lots of vendor provided consulting, a high cost of ownership, and a constant need to tweak the tool as the application changed. With New Relic, Lew Cirne turned this model on its head. New Relic is offered only on a SaaS basis, so you do not even have to install the back end. You just install the agent in your PHP, Java, .NET, or Ruby application, point the agents at the New Relic back end, and then log onto the web console. Pricing is on a simple monthly subscription basis.
AppDynamics. One of the key software architects at Wily, Jyoti Bansal, founded AppDynamics. His key insight was that first generation APM products made it too hard to monitor applications in production. First generation products required you to make difficult tradeoffs between how comprehensively to monitor every object and method in your stack, and how much overhead you were willing to tolerate. This led to the aforementioned constant need to tweak. What AppDynamics built instead is an APM product that you install in your Java or .NET application. It then discovers the low level transactions in your application, traces them through the network of servers to the database and back again, and gives you detailed diagnostics when things go wrong, all without any configuration or tweaking. AppDynamics also includes very strong orchestration capabilities, giving you an easy to use UI from which you can take automated actions based upon events in AppDynamics. This will prove to be a crucial capability in the coming integration of APM solutions with virtualization platforms like vSphere.
BlueStripe. Also founded by two ex-Wily executives, Vic Nyman and Chris Neal (are you starting to detect a pattern here?), BlueStripe is also based upon a key insight. The key insight was that not all applications that enterprises count upon are custom developed (so the audience for an APM solution is not always a developer who can change code), and that applications have been written in so many ways that an APM solution dependent upon a particular set of application run times will be unable to cover the waterfront that exists in many enterprises. BlueStripe therefore built an APM solution that works for any Windows, Linux, AIX, or Sun OS based application, that automatically discovers the application topology, and that automatically calculates hop-by-hop and end-to-end response time. BlueStripe also has deep integration with the vSphere APIs, making it the only APM solution that is fully virtualization platform aware at this point in time.
Confio Software. A recent entry into the space of APM for virtualized applications, Confio focuses upon the performance of virtualized databases, and the relationship between database performance and the performance of the underlying virtualization infrastructure, including the storage arrays. Confio is uniquely able to cross-correlate storage latency and database performance, making it possible to virtualize database servers where there had previously been reluctance to do so.
dynaTrace (recently acquired by Compuware). dynaTrace has pioneered the ability to trace transactions from their inception in the user’s browser through all of the layers of the application system, to the database and back again to the end user. dynaTrace is the only solution that can do this in production with acceptable overhead for latency sensitive e-commerce applications. Now that dynaTrace is part of Compuware, it is part of a broad portfolio of APM assets including all of the outside-in and synthetic transaction testing capabilities that Compuware previously acquired when it bought Gomez.
ExtraHop Networks. ExtraHop takes a network insight approach to APM. It comes as a physical appliance that gets attached to a mirror or SPAN port on a physical switch in the network, and it also comes as a virtual appliance that gets attached to the virtual mirror/SPAN port on the VMware vSphere vSwitch. The issue with network approaches has always been that they lacked deep insight into what was really going on with the applications, and they had a hard time presenting response time numbers for applications that really made sense. ExtraHop has addressed both of these issues. The issue of application visibility has been addressed by specifically cracking into almost all of the relevant layer 7 protocols so as to provide visibility into, for example, individual SQL queries. The response time issue has been addressed by ground-breaking features that assemble a stream of TCP/IP requests and responses into transactions that map to the real transactions going on in the higher levels of the stack.
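To make the idea of assembling requests and responses into transactions concrete, here is a minimal sketch in Python. It is not ExtraHop's implementation. The `Packet` record, its fields, and the rule of pairing each response with the oldest outstanding request on the same flow are all assumptions we made up for illustration, and real wire-data products have to deal with far messier protocol behavior.

```python
from collections import namedtuple

# Hypothetical record for one observed message on the wire.
# timestamp_ms: when it was seen; flow: an identifier for the connection;
# kind: "request" or "response".
Packet = namedtuple("Packet", "timestamp_ms flow kind")

def response_times(packets):
    """Pair each response with the oldest outstanding request on its flow.

    Returns a list of (flow, elapsed_ms) tuples, one per completed transaction.
    """
    pending = {}   # flow -> timestamps of requests still awaiting a response
    results = []
    for p in sorted(packets, key=lambda p: p.timestamp_ms):
        if p.kind == "request":
            pending.setdefault(p.flow, []).append(p.timestamp_ms)
        elif p.kind == "response" and pending.get(p.flow):
            started = pending[p.flow].pop(0)
            results.append((p.flow, p.timestamp_ms - started))
        # A response with no outstanding request is ignored in this sketch.
    return results
```

Given two interleaved flows, the function reports one response time per flow, which is the kind of number an operations team can actually reason about.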
VMware. The great news for believers in VMware’s vision of automated management is that, with vFabric Application Performance Manager, VMware is going to try again on the APM front. The first try, AppSpeed, failed because it did not go as deep as the best “code aware” APM solutions, and did not cover the waterfront of all applications as a compensating capability. vFabric APM combines the features of AppSpeed (understanding the topology and performance of a broad set of applications) with technology from Hyperic (which broadened the set of supported applications) and the Insight technology, which provides deep insight into Java based applications in production.
With vFabric APM, VMware is bringing several important innovations to the table in the APM arena and has a very aggressive product enhancement plan to boot:
- A Performance Index for each application. This is a normalized score between 0 and 100 that is based both upon the average response time for the application, and the variability (standard deviation) in that response time. This is a huge step forward for the APM industry, as there has not been any well understood and accepted metric that characterizes the performance of an application and that can be used as the basis for a Service Level commitment. Other vendors have implemented Apdex, which is useful for understanding end user experience, but it is not useful for understanding the end-to-end response time of an application system from the perspective of the application itself.
- Integration of change events into an APM Solution. 80% of the problems in the performance of applications come from changes – either changes in code that get put into production, or changes in the environment that supports the application. Right now vFabric APM focuses upon tracking change events like a new snapshot of the image that comprises the application server. Longer term it would be reasonable for vFabric APM to integrate with the vSphere API events so that changes in the infrastructure could be cross-correlated with changes in the Performance Index.
- Integration with vSphere orchestration. This means that it is entirely possible to write a rule that says that “if X, Y, and Z happen, take action A”. This is a down payment on fulfilling the vision of automated IT management that VMware has articulated. The Performance Index is absolutely the right trigger for automated actions. The work to come is in the area of figuring out which automated actions can be triggered deterministically (through rules), which ones should be triggered based upon statistical analysis (via the Integrien technology), and which ones just have to remain subject to operator approval.
- A focus upon Application Operations. In “Is it Time to Reorganize Data Center Operations”, we proposed that IT Operations become Virtual Operations, and that a new function, entitled Application Operations, needed to come into being to support applications in production. It is essential for IT to form and staff Application Operations teams to own the performance of applications running on virtual infrastructures, because if these teams do not exist and do not guarantee the Performance Index for the most important applications, enterprises will not be able to successfully virtualize important applications.
- A focus on Price Performance. We are rapidly approaching a world where an application owner can package their application up as a vApp, and then engage in price performance negotiations and tradeoffs with the internal IT organization, and a variety of external service providers. Three pieces of information are needed to make these comparisons – the Performance Index for an application, its transaction load, and the price for the performance/load curve required for the application.
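To illustrate how a score built from average response time and its standard deviation could drive an “if X, Y, and Z happen, take action A” rule, here is a minimal sketch in Python. VMware has not published the Performance Index formula in anything we cite here, so the formula, the `target_ms` and `variability_weight` parameters, the `floor` threshold, and the action names are all our own assumptions, chosen only to show the shape of the idea.

```python
from statistics import mean, pstdev

def performance_index(response_times_ms, target_ms, variability_weight=1.0):
    """Hypothetical 0-100 score; NOT VMware's actual formula.

    Combines the average response time with its standard deviation, so an
    application that is fast on average but erratic scores lower than one
    that is steady.
    """
    if not response_times_ms:
        raise ValueError("need at least one response time sample")
    avg = mean(response_times_ms)
    spread = pstdev(response_times_ms)
    effective = avg + variability_weight * spread
    if effective <= target_ms:
        return 100.0
    return round(100.0 * target_ms / effective, 1)

def choose_action(index, recent_change, floor=80.0):
    """Illustrative rule: the action names are placeholders, not product features."""
    if index >= floor:
        return "none"
    if recent_change:
        # A drop in the index right after a change event points at the change.
        return "flag_change_for_review"
    return "add_capacity"
```

For example, `performance_index([200, 200, 200, 200], target_ms=250)` and `performance_index([50, 350, 50, 350], target_ms=250)` describe two applications with the same average response time, yet the second scores lower because of its variability. Which of the resulting actions can safely run without operator approval is exactly the open question described above.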