Application Performance Management (APM) solutions have historically been monolithic systems used by IT operations to monitor the performance of production applications. But this is changing quickly. Agile and DevOps methods, cloud computing and a new generation of DevOps-focused APM products are together adding value throughout the agile development lifecycle, far beyond application support.
Legacy APM tools are generally expensive and complex, require lots of configuration to get working and are fragile when application ecosystems change. They were designed for applications running on physical hardware that rarely changed, the antithesis of agile development in the cloud. The new breed of APM products, including AppDynamics, New Relic, dynaTrace, Foglight and VMware vFabric APM amongst others, still does what its predecessors did, but is increasingly being embraced by developers, testers and DevOps team members. These people are using APM to add value during the analysis, design, development and testing phases too.
This article highlights specific ways agile teams are using APM outside IT operations, starting from a project’s kickoff.
#1 – Kickoff
As a part of starting a new project or product, many agile teams do a Sprint 0. This sprint is primarily focused on technical tasks and planning so that new feature development can begin in Sprint 1. While the product team is focused on writing the initial user stories, the technical team can focus on hooking up APM to their application while integration is still simple. This, along with other stories for continuous integration and code management, can be completed in Sprint 0.
These steps might be largely manual at first, or if they occur rarely. However, if setting up new applications or projects is common, then they can be largely automated. Tools like Puppet can script the installation of the controller and/or agents as a part of environment setup. Agent configuration file(s) can be packaged with application frameworks so new applications are automatically configured to communicate with the controller.
If you’re developing in a PaaS solution such as Heroku or Azure, integrating a new application with APM is even simpler. Just enable it in your PaaS management dashboard and you’re set.
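As a rough illustration of packaging agent configuration with an application framework, the sketch below renders a config file from a template. The key names, the default host `apm.example.internal` and the port are hypothetical; each APM vendor defines its own file format and settings.

```python
from string import Template

# Hypothetical settings: real APM agents each define their own keys and format.
AGENT_CONFIG_TEMPLATE = Template(
    "controller.host=$controller_host\n"
    "controller.port=$controller_port\n"
    "application.name=$app_name\n"
    "environment=$environment\n"
)


def render_agent_config(app_name, environment,
                        controller_host="apm.example.internal",
                        controller_port=8090):
    """Return the text of an agent config file for one application."""
    return AGENT_CONFIG_TEMPLATE.substitute(
        controller_host=controller_host,
        controller_port=controller_port,
        app_name=app_name,
        environment=environment,
    )


if __name__ == "__main__":
    print(render_agent_config("billing", "test"))
```

A provisioning tool could write the returned text to wherever the agent expects its configuration, so every new application starts out pointed at the same controller.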
#2 – Analysis
Most agile teams capture functional requirements in the user story format: As a <Persona>, I would like to <Do Something>, so that I can <Achieve Some Result>
Along with each user story is a set of acceptance criteria that specifies when the user story is “done”. Acceptance criteria often include non-functional requirements related to performance, scalability and resiliency. Teams often struggle with how to test these during development, an area where APM can help.
Defining a non-functional requirement can be as simple as the example above. The method alone clearly states which APM metric from which dashboard, under what amount of load and for what percentage of users. This clarity of requirement not only sets expectations with internal stakeholders including architects and developers, it’s also something that can be easily integrated into a business-facing dashboard built in the APM product during development and available after release.
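One way to make such a requirement testable is to capture its parts (metric, threshold, load and percentage of users) as data and check measured samples against it. The sketch below is illustrative only: the class, the field names and the sample figures are assumptions, not taken from any APM product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceCriterion:
    metric: str              # hypothetical APM metric name
    threshold: float         # maximum acceptable value for the metric
    percent_of_users: float  # share of samples that must meet the threshold
    concurrent_users: int    # load under which the samples were measured


def is_met(criterion, samples):
    """True if at least percent_of_users of the samples are within threshold."""
    if not samples:
        return False
    within = sum(1 for value in samples if value <= criterion.threshold)
    # Cross-multiplied comparison of the two percentages, avoiding division.
    return within * 100 >= criterion.percent_of_users * len(samples)


if __name__ == "__main__":
    # Illustrative requirement: 95% of users see a response time of 2 seconds
    # or less with 500 concurrent users.
    criterion = PerformanceCriterion("page_response_time_s", 2.0, 95.0, 500)
    print(is_met(criterion, [1.0] * 19 + [3.0]))
```

Keeping the load level in the criterion records the conditions under which the samples are expected to be collected, even though the check itself only looks at the measured values.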
#3 – Design & Development
During development APM is a supplemental tool for the team: not necessarily something used every day, but very handy for running down issues, such as why a particular test failed or why page load times are slow for single users. It can also bring poor design decisions to the surface, something that occurred on a recent project I audited.
Two companies were charged with building a new customer service web site for a client. One focused on the CMS and web application. The other focused on creating web services to expose data in legacy systems to customers. During design the legacy team insisted on creating fine-grained web services because it was the least expensive option and the easiest to do. The project manager agreed and the teams moved forward. The web team’s user story requirements called for a year’s worth of transaction data to be displayed on the web page. The web team implemented the story, making the right API calls, and the functional tests passed.
Soon thereafter developers and testers started complaining about the slow load times. They hooked up their APM tool and discovered the offending page was making 117 web service calls to get the data required to load the page. Although each call took less than 400 milliseconds, the sheer number made the performance horrible. The web team brought this to the legacy team’s attention and showed them the APM data. Together they quickly worked out a single API call that took additional parameters but made the integration much simpler and faster. Once each team refactored their code the page load time dropped dramatically.
Solving integration problems like these is a part of a developer’s life. Without the right tooling, some of these problems may take days or weeks to resolve and potentially result in a whole bunch of hand-written diagnostic code.
The new breed of APM tools understands this reality and is increasingly focusing its attention on developers and not just operations. In addition to offering free developer versions of their products, APM vendors are forming partnerships with PaaS vendors (such as New Relic with Heroku and AppDynamics with Azure) to make integrating monitoring into your applications very simple.
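To see why the number of calls mattered more than the cost of any single call, a quick upper-bound calculation helps. It assumes the calls ran one after another, which the story does not state, so treat the result as a worst case rather than a measured figure.

```python
CALLS_PER_PAGE = 117   # number of web service calls found in the APM data
MAX_CALL_MS = 400      # each call took less than this many milliseconds


def worst_case_page_ms(calls, ms_per_call):
    """Upper bound on time spent in service calls if they run sequentially."""
    return calls * ms_per_call


if __name__ == "__main__":
    before = worst_case_page_ms(CALLS_PER_PAGE, MAX_CALL_MS)
    print(f"Up to {before} ms spent in web service calls per page load")
```

Under that assumption the page could spend up to 46,800 milliseconds, nearly 47 seconds, waiting on web services, which is why collapsing the work into a single call made such a difference.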
#4 – Functional Testing
Agile teams leverage automated testing as a normal part of their sprints. While developers focus on automated unit testing, quality assurance typically focuses on integration and acceptance testing. During each sprint APM is hooked up to the application in the test environment(s). If one of the automated tests fails at a specific time, the corresponding snapshot of the failed transactions at that same time can be captured from APM to enable further analysis. Tools like Splunk can also help here, enabling developers and testers to collaboratively solve issues uncovered by testing, especially tough ones such as bugs that are only reproducible under certain conditions.
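Matching a failed test to the transactions captured around the same moment can be sketched as a simple time-window filter. The snapshot structure below (a list of dictionaries with a `timestamp` key) is a hypothetical export format, not the API of any particular APM tool.

```python
from datetime import datetime, timedelta


def snapshots_near_failure(snapshots, failed_at, window_seconds=30):
    """Return the snapshots captured within window_seconds of the failure."""
    window = timedelta(seconds=window_seconds)
    return [s for s in snapshots if abs(s["timestamp"] - failed_at) <= window]


if __name__ == "__main__":
    failed_at = datetime(2024, 1, 15, 10, 30, 0)
    snapshots = [
        {"id": "a", "timestamp": datetime(2024, 1, 15, 10, 29, 45)},
        {"id": "b", "timestamp": datetime(2024, 1, 15, 10, 45, 0)},
    ]
    for snapshot in snapshots_near_failure(snapshots, failed_at):
        print(snapshot["id"])
```

The window size is a judgment call: too narrow and clock skew between the test runner and the APM tool hides the relevant snapshot, too wide and unrelated transactions creep in.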
#5 – Performance Testing
This is one of the most popular uses of APM as it helps architects, developers and testers answer questions such as: what really happens to the application under load and can the application support the customer demand?
Modern APM tools are essentially application profilers with such low overhead that they can run all the time without negatively impacting an application’s performance. Gone are the days of hooking up an application profiler, running tests and having your results skewed greatly by the profiler’s invasive overhead. Today’s APM tools give developers the same drill-down capabilities that profilers traditionally provided, such as identifying the problem line of code or SQL statement, but without the overhead and extra setup.
What this means is that during performance tests, a team can watch the application’s performance under load in real time and diagnose issues on the fly. They can also save off results for post-analysis. APM tools are useful for clearly identifying bottlenecks and limits on scalability. They bring these issues to the attention of the team members who are in the best position to fix them, whether that means tweaking a configuration parameter or refactoring code. They’re also useful for recording previous performance test runs so teams can compare releases and look for any subtle trends.
It’s no secret that faster applications generate more revenue and a better customer experience. Amazon notes that every 100 millisecond increase in response time yields a 1% sales decline (that’s a $200M+ potential revenue impact). Google notes that a 500 millisecond increase in response time results in 20% less search traffic. This means the new APM tools are delivering value not only by reducing resolution times (costs), but also by improving performance (revenues), a great position for any product.
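Comparing saved performance test runs between releases can be as simple as checking each transaction against its earlier figure. The sketch below assumes results have been exported as a mapping of transaction name to average response time in milliseconds; the function name, the data shape and the 10% tolerance are all illustrative choices.

```python
def find_regressions(baseline, current, tolerance_pct=10.0):
    """Return {transaction: (old_ms, new_ms)} for transactions that slowed
    down by more than tolerance_pct between two test runs."""
    regressions = {}
    for name, old_ms in baseline.items():
        new_ms = current.get(name)
        if new_ms is not None and new_ms > old_ms * (1 + tolerance_pct / 100):
            regressions[name] = (old_ms, new_ms)
    return regressions


if __name__ == "__main__":
    previous_release = {"checkout": 200, "search": 100}
    this_release = {"checkout": 260, "search": 105}
    print(find_regressions(previous_release, this_release))
```

Transactions that exist in only one of the two runs are skipped here; a team may prefer to report them separately, since a new or removed transaction is itself worth a look.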
#6 – Production Deployment
APM is particularly helpful during new releases to production, which can be nerve-racking events in themselves. Most vendors have a way to mark a change to production so that post-release metrics (such as response time) can be compared with pre-release metrics. Should something look amiss or a performance problem be identified, the decision can be made to quickly roll back and investigate. Data from the APM tool can be used as a part of this analysis to figure out what went wrong before attempting the next release.
This same basic process can be applied by teams practicing continuous deployment. Instead of relying on humans to do the release analysis, the post-launch checks are run as automated tests and validated in part using APM, so only exceptions are raised to the attention of humans. Should issues arise, workflow scripts can send them to the organization’s incident management system. A table of popular DevOps-competent APM tools for use in dynamic and cloud-based environments is below.
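The pre-release versus post-release comparison can be sketched as follows. This is a simplified illustration, not how any vendor implements change markers: samples are assumed to be (timestamp, response time in ms) pairs, the deployment marker is a plain timestamp, and the 20% slowdown limit is an arbitrary example value.

```python
from statistics import median


def release_decision(samples, deployed_at, max_slowdown_pct=20.0):
    """Compare median response time before and after a deployment marker.

    samples: list of (timestamp, response_ms) pairs.
    Returns "keep", "rollback" or "insufficient-data".
    """
    before = [ms for ts, ms in samples if ts < deployed_at]
    after = [ms for ts, ms in samples if ts >= deployed_at]
    if not before or not after:
        return "insufficient-data"
    limit = median(before) * (1 + max_slowdown_pct / 100)
    return "rollback" if median(after) > limit else "keep"


if __name__ == "__main__":
    samples = [(1, 100), (2, 110), (3, 105), (10, 180), (11, 170), (12, 190)]
    print(release_decision(samples, deployed_at=10))
```

Using the median rather than the mean keeps a single slow outlier from triggering a rollback on its own.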
DevOps Focused APM Tools
|Vendor/Product||Product Focus||Deployment Method||Data Collection Method||Supported App Types||Application Topology Discovery||Cloud Ready||“Zero-Config”||Deep Code Diagnostics|
|AppDynamics||Monitor custom developed Java and .NET applications across internal and external (cloud) deployments||On Premise/SaaS||Agent inside of the Java JVM or the .NET CLR||Java/.NET|
|dynaTrace (Compuware)||Monitoring of complex enterprise applications that are based on Java or .NET but which may include complex enterprise middleware like IBM MQ and CICS||On Premise||Agent inside of the Java JVM or the .NET CLR||Java/.NET, WebSphere Message Broker, CICS, C/C++|
|New Relic RPM||Monitor custom developed Java, .NET, Ruby, Python, and PHP applications across internal and external (cloud) deployments||SaaS||Agent inside of the Java JVM, .NET CLR, or the PHP/Python runtime||Ruby/Java/.NET/PHP/Python|
|Quest Foglight||Monitor custom developed Java and .NET applications and trace transactions across all physical and virtual tiers of the application||On Premise||Agent inside of the Java JVM or the .NET CLR||Java/.NET|
|VMware vFabric APM||Monitor custom developed Java applications in production. Strong integration with the rest of the VMware product line including automated remediation and scaling.||On Premise||Mirror port on the vSphere vSwitch and an agent inside the Java JVM||HTTP/Java/.NET/SQL|
#7 – Support
This is the traditional use case for APM and still the most popular: helping operations teams reduce incident resolution times. This may be in real time as an incident occurs or during post-incident analysis, looking for clues as to what went wrong. Often this includes looking into slow or failed transactions to identify root causes. I’ve known teams to use APM to discover that the 5am database backup job was causing application performance to degrade.
From SLA management to operational dashboards, these newer APM tools still support their core users: the operations administrator and the help-desk engineer. But with increased simplicity and more intuitive user interfaces, these new APM tools are adding value beyond their traditional support role.
A new breed of DevOps-focused APM tools is moving performance management outside the domain of operations. With features to support analysts, architects, developers, testers and DevOps, APM is at home in all phases of agile development.