As SF author William Gibson said, “The future is already here—it’s just not very evenly distributed.” Some IT infrastructure teams live in a future where they are resolving every issue before there are problems for end users. These teams live in a nirvana where help-desk tickets are all requesting new accounts to be created for staff who start work next week. Phone calls bring praise from line-of-business managers. Personally, I have never seen these IT teams. Maybe they exist; maybe they are just a dream. Many IT infrastructure teams work in a very different world: a world of hurt and pain, where application performance is unpredictable. The help-desk call queue sometimes spirals out of control. When the team is this deep in alligators, it can be hard to see how to drain the swamp. A crucial first step is getting the lay of the land and some idea of where the problems are coming from. The next step is to start dealing with the root causes of issues before they cause problems.
Articles Tagged with Root Cause Analysis
IT operations analytics (ITOA) is the new language that incorporates analytics as a part of IT operations. This is a requirement for today’s environments, as even small labs generate terabytes of data a day: terabytes of logs from applications, network sensors, security devices and products, automation tools, and more. The list of possible streams of data is endless. It is up to the IT operations folks to make sense of this never-ending stream of data. This is where analytics steps in. But analytics without knowledge often leads to chasing rabbits down holes, as there can be a large number of false positives.
We all need performance and capacity management tools to fine-tune our virtual and cloud environments, but we need them to do more than just tell us there may be problems. Instead, we need them to find root causes for problems, whether those problems are related to code, infrastructure, or security. New applications designed for the cloud, à la Netflix, and older technologies instantiated within the cloud both need more in order to tell us about their health. Into this breach steps a new set of tools, alongside an existing set of tools.
VMware has made it known for quite some time that virtualization, private clouds (IT as a Service), hybrid clouds, and public clouds will create the need for a new management stack, and that VMware intends to be an aggressive supplier of such a stack. However, what VMware has never before said is precisely how this new management stack would differ (other than by explicitly supporting vSphere) from all of the other management stacks that have existed for all of the other computing platforms in the world.
When VMware announced the three editions of vCenter Operations, it sent three very clear messages about how it felt that monitoring solutions for vSphere should be constructed. The first message was that VMware views Performance Management and Capacity Management as two sides of the same coin. The second message was that Configuration Management is an essential part of a performance and capacity management solution, since so many of the problems are in fact configuration related. The last message was that, given the complexity and rate of change in virtualized environments, the interpretation of monitoring data has to be automated with self-learning analytics.