Ask yourself two simple questions. How many applications does your company have that warrant management on an availability, response time, and integrity-of-service basis? And for how many of those applications do you have a functional Application Performance Management (APM) solution in place that actually allows you to measure and guarantee availability, response time, and integrity of service?

If you are like most enterprises, you might have 500 or 1,000 applications that meet the above criteria, but you probably have only 25 or 50 of your most important applications under management. Why is that?

  1. Many enterprises have confused (with vendor help) the notion of monitoring the resources that an application uses with monitoring its performance. Users of an application do not call up and say, “the application is using too much memory and my quality of service is awful”. They call up and say, “it is slow”. Slow means the response time is awful (see the response-time probe sketch after this list).
  2. If you have deployed APM tools from first-generation APM vendors like CA (Wily), HP (HP Diagnostics), or IBM (Tivoli ITCAM), what you have likely found is that these products are infinitely complex to install, configure, and make work, and that they require large amounts of vendor-provided professional services before they can monitor an application in production.
  3. These products are completely out of step with how applications are deployed today. Years ago, applications were deployed on a few high-end WebLogic or WebSphere application servers, and you were happy to pay $20,000 per server to monitor them. Today applications are deployed on scaled-out commodity hardware and middleware, and you are not going to pay 4X what it cost to deploy the entire application server (hardware and software) just to monitor it.
  4. These products are completely out of step with how applications are developed today. Many organizations now use Agile methodologies to roll changes into production on a monthly or even weekly basis. Legacy APM products require manual actions every time the application changes, which renders them irrelevant in rapidly changing environments.
  5. Legacy APM products are clueless when it comes to distributed data center deployments and the cloud. Their communications model assumes that the management system can poll the agents inside the JVMs or .NET CLRs over a subnet. That means you cannot deploy a management system inside your data center and manage agents that happen to be running in an outsourced data center or a cloud.
  6. There is tremendous innovation in application platforms, but APM products have for the most part not kept up. Developers are under constant pressure to deliver more business functionality in code, with less time and less money to do it. This means that newer platforms like Ruby and PHP may be as important to you as Java and the .NET languages.
  7. APM vendors have largely focused on selling to the people who develop, and have to support, custom-developed applications in house. This leaves out the question of how to support all of the purchased applications that underlie custom-developed code. For example, it is not at all unusual to see custom-developed applications layered on top of purchased applications like SAP, Oracle Applications, or PeopleSoft.
  8. Legacy APM tools are completely inappropriate for monitoring applications that run on dynamic platforms like VMware or other virtualization stacks, because they either rely upon resource utilization metrics as a proxy for performance or cannot automatically adapt to changes in the environment that supports the application.
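
To make point #1 concrete, here is a minimal sketch (not any vendor's product) of measuring what the user actually experiences: the end-to-end response time of one transaction, compared against a response-time objective, instead of inferring health from CPU or memory counters. The URL and the two-second threshold are illustrative assumptions only.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// A minimal synthetic probe: time one user-visible request end to end and
// compare it against a response-time objective, rather than inferring health
// from CPU or memory counters. The URL and threshold are illustrative only.
public class ResponseTimeProbe {
    public static void main(String[] args) throws Exception {
        String url = args.length > 0 ? args[0] : "https://example.com/checkout"; // hypothetical endpoint
        long objectiveMillis = 2000; // example objective, not a real SLA

        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();

        long start = System.nanoTime();
        HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;

        System.out.printf("%s -> HTTP %d in %d ms%n", url, response.statusCode(), elapsedMillis);
        if (elapsedMillis > objectiveMillis) {
            System.out.println("SLOW: the user-visible response time exceeded the objective");
        }
    }
}
```

This is exactly the number the user complaining "it is slow" cares about; resource metrics are useful for diagnosis, but they are not the measure of service quality.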
So What To Do?
The first step is to embrace a new set of requirements for APM solutions:
  1. The APM solution(s) that you purchase should be able to address ALL of the business- and performance-critical applications in your enterprise. This may require more than one product, but you should start the process with a list of all of your important applications, whether they are purchased or custom developed, and how they are architected.
  2. You should clearly define what the objective of the APM solution is for you. Are you trying to fix bugs in production more rapidly for a custom-developed application, or are you trying to support a purchased application in production (with no access to the underlying code)?
  3. Points #1 and #2 above demand a trade-off between depth and breadth. You can easily get a first-class APM solution that supports Java and .NET and that gives you deep-dive analysis into the application stack and the database calls. You cannot get this for every application that you own, since not every application that you own is written in Java or .NET.
  4. Consider the architectural nature of your application. If your application is deployed across multiple tiers of dynamically scaled-out servers, then you need something that can discover transaction flows across that mesh of servers and trace load and response time along the way (see the tracing sketch after this list). First-generation APM tools cannot and will not be able to do this for you.
  5. Consider your development process if you are building the most important applications yourselves. If you are rapidly changing code in production, then you need an APM tool that can automatically keep up, not one that requires manual configuration every time you add a new transaction, object, or method.
  6. Consider where every part of your application is going to run. It used to be that everything ran within the four walls of your data center. Now parts of it might be outsourced, and parts of it might run on a scale-up or scale-down basis in a public cloud.
  7. Think about the price and the long-term cost of ownership of the solution. It is easy to buy something that has a consultant in the box. If it does, send it back. If you are deploying your application on commodity hardware and commodity (open source) application stacks like Tomcat, VMware vFabric, Red Hat JBoss Application Server, and a free version of Linux, you should not pay more per server to manage the application running on it than you paid for the hardware and software platform in the first place.
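
To illustrate point #4, here is a minimal sketch (under stated assumptions, not any vendor's implementation) of how transaction flows can be followed across tiers: each tier reuses or creates a correlation id, forwards it to the next tier, and logs one line per hop so a collector could stitch the hops into a single end-to-end transaction. The header name, default port, and DOWNSTREAM_URL environment variable are illustrative assumptions; only the JDK's built-in HTTP server and client are used.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.UUID;

// A minimal sketch of cross-tier transaction tracing: each tier reuses or
// creates a correlation id, forwards it downstream, and logs one line per hop
// (trace id, tier, elapsed ms) that a collector could stitch into one flow.
// The header name, port, and DOWNSTREAM_URL variable are assumptions.
public class TracedTier {
    static final String TRACE_HEADER = "X-Correlation-Id"; // hypothetical header name
    static final HttpClient CLIENT = HttpClient.newHttpClient();

    public static void main(String[] args) throws Exception {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8080;
        String downstream = System.getenv("DOWNSTREAM_URL"); // next tier, if any

        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            // Reuse the caller's correlation id, or start a new trace at the edge.
            String traceId = exchange.getRequestHeaders().getFirst(TRACE_HEADER);
            if (traceId == null) traceId = UUID.randomUUID().toString();

            long start = System.nanoTime();
            String body = "ok";
            if (downstream != null) {
                // Propagate the id so the next tier's timing joins the same transaction.
                HttpRequest req = HttpRequest.newBuilder(URI.create(downstream))
                        .header(TRACE_HEADER, traceId).GET().build();
                try {
                    body = CLIENT.send(req, HttpResponse.BodyHandlers.ofString()).body();
                } catch (IOException | InterruptedException e) {
                    body = "downstream unavailable";
                }
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;

            // One log line per hop; grouping by traceId reconstructs the end-to-end flow.
            System.out.printf("trace=%s tier=web-%d elapsedMs=%d%n", traceId, port, elapsedMs);

            exchange.getResponseHeaders().add(TRACE_HEADER, traceId);
            byte[] bytes = body.getBytes();
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(bytes);
            }
        });
        server.start();
        System.out.println("Tier listening on port " + port);
    }
}
```

Run two copies on different ports with the first tier's DOWNSTREAM_URL pointed at the second, and each request produces one log line per tier sharing the same trace id. Collecting and correlating exactly this kind of data automatically, as topologies scale out and change, is what a modern APM product has to do for you.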
Conclusion
Less than 5% of the applications that matter to enterprises worldwide are under management by an APM solution that can help ensure application response time, application availability, and the integrity of the critical transactions within the application. This is because first-generation APM solutions have been too expensive to purchase, too limited in their scope, and too expensive to configure, maintain, and own.

Bernd Harzog

Bernd Harzog is the Analyst at The Virtualization Practice for Performance and Capacity Management and IT as a Service (Private Cloud).

Bernd is also the CEO and founder of APM Experts, a company that provides strategic marketing services to vendors in the virtualization performance management and application performance management markets.

Prior to these two companies, Bernd was the CEO of RTO Software, the VP Products at Netuitive, a General Manager at Xcellenet, and Research Director for Systems Software at Gartner Group. Bernd has an MBA in Marketing from the University of Chicago.




12 comments for “Why Is Application Performance Management So Screwed Up?”

  1. Netzwerg
    October 11, 2011 at 1:49 PM

    SNMP was invented for monitoring. But most vendors ignore it and cook their own soup so they can sell their overpriced management suites. Integrated monitoring could be much simpler!

  2. Bharzog
    October 11, 2011 at 1:54 PM

    SNMP was invented to monitor network devices. It does nothing to help monitor true application performance (response time). It can tell you data throughput through a switch, and it can tell you how busy the switch is, but it cannot even tell you the latency that the network is seeing or inducing. Furthermore, when the switch gets busy, many of them stop sending SNMP data, which means that SNMP ceases being useful at the exact time that you need it most.

  3. October 11, 2011 at 2:28 PM

    Great blog post – I would add one additional thought. What about mobile? Where are the tools to enable BTM (Business Transaction Monitoring) on mobile devices?

  4. Bharzog
    October 11, 2011 at 6:56 PM

    You are absolutely correct. The proliferation of end-user devices (beyond just Windows PCs) makes this problem exponentially harder, and makes the first-generation APM solutions even more useless. This will not get fixed until there are either installable or downloadable agents for Android and the Apple platform that do what first-class agents do on today’s Java and .NET platforms. If you are an e-commerce provider with customers on mobile platforms, you need to be thinking about this problem from an end-to-end transaction-tracing perspective. This pushes the envelope in APM way beyond the legacy first-generation vendors and into startup territory.

  5. October 13, 2011 at 9:46 AM

    Bernd’s article nailed it. Static monitoring solutions don’t work for dynamic IT, and none of the vendors mentioned are anywhere close to addressing the problem. If IT wants to vMotion up additional capacity at the web tier but the monitoring team delays them for two days to write static monitors, there is definitely a problem. If it takes longer to rev my monitoring solution than it takes to run an Agile development sprint, there is a problem.

    The big four monitoring vendors continue to make their suites more and more complicated by acquiring new technology but never doing the required work to offer a truly integrated means of looking at an entire application top to bottom, tier to tier, datacenter to datacenter. Monitoring a business critical application means analyzing the performance of every element (network, web, app, db, auth, etc.).

    The goal of monitoring should be to see through the massive complexity involved in supporting large scale apps but when a vendor proposes a suite of 15-20 tools and teams of consultants to install, test, configure, and write correlation engines to make sense of all the disconnected data, they are ultimately adding to the complexity.

    If you will excuse our self-promotion, ExtraHop has taken a huge stride toward solving the problem: massive scale, dynamic monitoring (no teams of consultants; it’s actually fully operational within an hour!), auto-discovery of physical and virtual devices as they are spun up on the network, and full layer 2-7 analysis in the cloud or in your data center. Our customers love us. Have you heard that claim from any of the big four monitoring vendors lately?

  6. October 13, 2011 at 10:42 AM

    My startup is working on such a product. The goal is to provide the same visibility that APM does and to build tools that help developers the way APM/BPM products would.

    It is currently under development.
    Thank you,
    Dor Juravski

  7. Leendert Meyer
    October 17, 2011 at 7:27 AM

    Hi Bernd,

    Can you please clarify when an APM solution is deemed first generation and when it “graduates” to being second generation?

    Thanks,
    Leendert.

  8. Bharzog
    October 19, 2011 at 7:33 PM

    Hi Leendert,

    Just have a look at criteria 1 through 7 in the post. First-generation APM solutions in general meet few or none of these criteria. The innovators in this space (AppDynamics, BlueStripe, New Relic, dynaTrace, ExtraHop) all address depth, breadth, zero configuration, and acceptable overhead in production in creative ways.

    The bottom line is simple. When virtualizing, start by throwing out every legacy systems management product you own.

    Cheers,

    Bernd

  9. Ali
    November 17, 2011 at 3:22 AM

    I need some help finding out whether there is any tool that can provide the stats below for a PowerBuilder application running against Oracle and Sybase. Also, please point out if I missed any other areas that should be monitored.

    1) Distribution of total client event time into the time consumed by the application, the network, and the database.

    2) Details on how much time is spent in the app and which application function/method is the root cause.

    3) Number of DB requests made.

    4) DB I/O stats.

    5) CPU time per DB request.

    6) DB server memory usage.

    7) Size of data transferred between the DB and the app server.
