Why Is Application Performance Management So Screwed Up?

Ask yourself this very simple set of questions. How many applications does your company have that warrant management on an availability, response time, and integrity of service basis? For how many of those applications do you have a functional Application Performance Management (APM) solution in place that actually allows you to measure and guarantee availability, response time, and integrity of service?

If you are like most enterprises, you might have 500 or 1,000 applications that meet the above criteria, but you probably have only 25 or 50 of your most important applications under management. Why is this?

  1. Many enterprises have confused (with vendor help) the notion of monitoring the resources that an application uses with its performance. Users of an application do not call up and say, “the application is using too much memory and my quality of service is awful”. They call up and say, “it is slow”. Slow means response time is awful.
  2. If you have deployed APM tools from first generation APM vendors like CA (Wily), HP (HP Diagnostics), or IBM (Tivoli ITCAM), you have likely found that these products are extremely complex to install and configure, and that they require large amounts of vendor-provided professional services before they can monitor an application in production.
  3. These products are completely out of step with how applications are deployed today. Years ago, applications were deployed on a few high-end WebLogic or WebSphere application servers, and you were happy to pay $20,000 per server to monitor them. Today applications are deployed on scaled-out commodity hardware and middleware, and you are not going to pay 4X what it cost to deploy the entire application server (hardware and software) just to monitor it.
  4. These products are completely out of step with how applications are developed today. Today many organizations use Agile methodologies to roll changes into production on a monthly or even weekly basis. Legacy APM products require manual actions every time the application changes. This renders them irrelevant in rapidly changing environments.
  5. Legacy APM products are clueless when it comes to distributed data center deployments and the cloud. Their communications model assumes that the management system can poll the agents inside the JVMs or the .NET CLR over a subnet. That means you cannot deploy a management system inside your data center and manage agents that happen to be running in an outsourced data center or a cloud.
  6. There is tremendous innovation in application platforms, but APM products have for the most part not kept up. Developers are under constant pressure to produce more business functionality in code, but have less time and less money to do it. This means that new platforms like Ruby and PHP may be as important to you as Java and the .NET languages.
  7. APM vendors have largely focused upon selling to the people who develop and have to support custom developed applications in house. This leaves out the question of how to support all of the purchased applications that underlie custom developed code. For example it is not at all unusual to see custom developed applications layered on top of purchased applications like SAP, Oracle Applications or PeopleSoft.
  8. Legacy APM tools are completely inappropriate for monitoring applications that run on dynamic platforms like VMware, or other virtualization stacks. The reason for this is that they either rely upon resource utilization metrics as a proxy for performance, or they cannot automatically adapt to changes in the environment that supports the application.
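Point 1 above separates resource metrics from what users actually experience, which is response time. A minimal sketch of that distinction follows, assuming response times are sampled in milliseconds and judged against a threshold you pick yourself. The function names and the 500 ms threshold are hypothetical and do not come from any APM product.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_ms) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def summarize(samples_ms, threshold_ms):
    """Describe what users felt: tail latency and the share of slow requests."""
    slow = sum(1 for s in samples_ms if s > threshold_ms)
    return {
        "count": len(samples_ms),
        "p95_ms": percentile(samples_ms, 95),
        "slow_fraction": slow / len(samples_ms),
    }


if __name__ == "__main__":
    # Hypothetical samples: four fast requests and one very slow one.
    print(summarize([120, 80, 95, 2500, 110], threshold_ms=500))
```

The summary reports on "it is slow" directly. CPU or memory readings taken over the same interval could look perfectly healthy while one request in five blows past the threshold.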
So What To Do?
The first step is to embrace a new set of requirements for APM solutions:
  1. The APM solution(s) that you purchase should be able to address ALL of the business and performance critical applications in your enterprise. Now this may require more than one product, but you should start the process with a list of all of your important applications, whether they are purchased or custom developed, and how they are architected.
  2. You should clearly define what the objective of the APM solution is for you. Are you trying to fix bugs more rapidly in production for a custom developed application, or are you trying to support a purchased application in production (with no access to the underlying code)?
  3. Points #1 and #2 above demand a trade-off between depth and breadth. You can easily get a first class APM solution that supports Java and .NET and that gives you deep dive analysis into the application stack and the database calls. You cannot get this for every application that you own, since not every application that you own is written in Java or .NET.
  4. Consider the architectural nature of your application. If your application is deployed across multiple tiers of dynamically scaled out servers, then you need something that can discover transaction flows across that mesh network of servers, and trace load and response time in the process. First generation APM tools cannot and will not be able to do this for you.
  5. Consider your development process if you are building the most important applications for yourself. If you are rapidly changing code in production, then you need an APM tool that can automatically keep up, rather than one that requires manual configuration every time you add a new transaction, object or method.
  6. Consider where every part of your application is going to be running. It used to be everything ran in the four walls of your data center. Now parts of it might be outsourced, and parts of it might run on a scale up or scale down basis in a public cloud.
  7. Think about the price and the long term cost of ownership of the solution. It is easy to buy something that has a consultant in the box. If it does, send it back. If you are deploying your application on commodity hardware and commodity (open source) application stacks like Tomcat, VMware vFabric, Red Hat JBoss Application Server, and a free version of Linux, you should not pay more per server to manage the application running on it than you paid for the hardware and software platform to begin with.
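The cost rule in point 7 is simple enough to check with arithmetic. The sketch below assumes you know both per-server figures. The $20,000 monitoring price is the one quoted earlier in the post, while the $5,000 platform cost is only an illustrative figure chosen to match the post's "4X" remark. Both function names are hypothetical.

```python
def monitoring_to_platform_ratio(monitoring_cost, platform_cost):
    """How many times the platform cost is spent on monitoring, per server."""
    if platform_cost <= 0:
        raise ValueError("platform_cost must be positive")
    return monitoring_cost / platform_cost


def passes_cost_rule(monitoring_cost, platform_cost):
    """True when monitoring costs no more per server than the platform itself."""
    return monitoring_cost <= platform_cost


if __name__ == "__main__":
    # Illustrative figures only: $20,000 monitoring on a $5,000 commodity server.
    print(monitoring_to_platform_ratio(20_000, 5_000))
    print(passes_cost_rule(20_000, 5_000))
```

Running the same check across a scaled-out tier shows quickly whether a per-server license model survives contact with commodity deployments.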
Less than 5% of the applications that matter to enterprises worldwide are under management by an APM solution that can help ensure application response time, application availability and the integrity of the critical transactions within the application. This is because first generation APM solutions have been too expensive to purchase, too limited in their scope and too expensive to configure, maintain and own.

12 Comments on "Why Is Application Performance Management So Screwed Up?"

SNMP was invented for monitoring. But most vendors ignore it to cook their own soup in order to be able to sell their overpriced management suites. Integrated monitoring could be much simpler!


SNMP was invented to monitor network devices. It does nothing to help monitor true application performance (response time). It can tell you data throughput through a switch, and it can tell you how busy the switch is, but it cannot even tell you the latency that the network is experiencing or inducing. Furthermore, when the switch gets busy, many of them stop sending the SNMP data, which means that SNMP ceases being useful at the exact time that you need it most.


Great blog post – I would add one additional thought. What about Mobile? Where are the tools to enable BTM (Business Transaction Monitoring) on Mobile devices?

You are absolutely correct. The proliferation of end user devices (beyond just Windows PCs) makes this problem exponentially harder, and makes the first generation APM solutions even more useless. This will not get fixed until there are either installable or downloadable agents for Android and the Apple platform that do what first class agents do in today’s Java and .NET platforms. If you are an e-commerce provider with customers on mobile platforms you need to be thinking about this problem from an end to end transaction tracing perspective. This pushes the envelope in APM way beyond the legacy first generation…

Bernd’s article nailed it. Static monitoring solutions don’t work for dynamic IT and none of the vendors mentioned are anywhere close to addressing the problem. If IT wants to v-motion up additional capacity at the web tier but the monitoring team delays them for 2 days to write static monitors, there is definitely a problem. If it takes longer to rev my monitoring solution than it takes to run an Agile development sprint, there is a problem. The big four monitoring vendors continue to make their suites more and more complicated by acquiring new technology but never doing the required…

My startup is working on such a product. The goal is to provide the same visibility APM does and create the same tools that help developers the same way APM/BPM would.

It is currently under development.
Thank you,
Dor Juravski


Hi Bernd,

Can you please clarify when an APM solution is being deemed first generation and when it “graduates” to being second generation?



[…] Harzog’s post Why is Application Performance Management so Screwed Up? started a lot of discussions on the Internet. The post is a very good list of existing issues you […]


Hi Leendert,

Just have a look at criteria 1 through 7 in the post. 1st generation APM solutions in general meet none or few of these criteria. The innovators in this space (AppDynamics, BlueStripe, New Relic, dynaTrace, ExtraHop) all address depth, breadth, zero config, and acceptable overhead in production in creative ways.

The bottom line is simple. When virtualizing start by throwing out every legacy systems management product you own.




I need some help finding out whether there is any tool that can give the stats below for a PowerBuilder application running against Oracle and Sybase. Please also point out any other areas to monitor that I have missed.

1) Distribution of entire client event time into time consumed by application, network and database.

2) Details on how much time is spent in the app and which application function/method is the root cause

3) # of DB requests made

4) DB IO stats

5) CPU time per db request

6) DB server memory usage

7) Size of data transferred between DB and app server


[…] For further discussion of why most IT organizations only monitor approximately five percent of their application portfolio, we recommend this post from Bernd Harzog of The Virtualization Practice: Why Is Application Performance Management So Screwed Up? […]
