As a delegate for Tech Field Day 6 in Boston, I was introduced to many third-party management tools. In the past I have also been given briefings on various VMware, Hyper-V, and Citrix Xen management tools. Many of these tools are marketed directly to the administrator, but they can be used by more than the administrator. These tools should be marketed to management and administrators as well as the network operations center (NOC). The NOC, you say? Why should they see the details of my environment? The NOC should not, but they should be able to tell when systems are in failure states that go beyond hardware failures. Only a few tools can be used this way today. The sooner administrators get word of a problem, the sooner it can be fixed. The NOC is the one place that centralizes all monitoring, whether for security or for the health of your virtual and cloud environments.

At Tech Field Day some of us wanted more or better integration with VMware vCenter or other hypervisor management servers. However, I wanted something I could view from the NOC that would tell me the health of the system, either as a component of another dashboard or as a dashboard of its own. Preferably it would be part of some other mashup, as there are already too many dashboards. So which tools allow me to mash up their views within other dashboards?
- Solarwinds VMAN objects can be accessed through a specialized URL that other dashboards can call, as seen at Tech Field Day 6.
- Reflex Systems has a portlet convention for viewing management as well as security data, as seen in past briefings.
The other tools we saw are their own dashboards and do not yet support this concept, but they can all email a report. However, a report may arrive too late. Why? Because any issue could mask a security issue, so we need continual monitoring and auditing capability. A NOC provides this level of monitoring day and night. In addition, we need tools that can see the patterns that appear within a virtual environment and so know what is considered healthy and what is not. Into this mix would fit tools like VMware vCenter Operations, Appspeed, Solarwinds VMAN, vKernel, and others that have self-learning capabilities.
Vendors need to think outside the box, embrace environments such as the network operations center, and provide a way to mash up their ‘dashboards’ within a super-dashboard that is easy to read and that makes it easy to conclude that there is a point-in-time issue or a change in the daily operational pattern. We need to be notified of issues as soon as they are detected, with time to detection down to 5 minutes or less, not on the schedule on which reports are typically generated. While reports are great for managers and the like, they do not help the network operations center catch problems as close to when they happen as possible.
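To make the super-dashboard idea concrete, here is a minimal sketch of how a NOC view might roll several tool panels up into one status while enforcing the 5-minute detection window. Everything in it is an assumption for illustration: the `Panel` record, the `Health` levels, and the choice to treat stale data as the worst state are hypothetical and do not reflect any vendor's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum


class Health(IntEnum):
    """Hypothetical health levels, ordered so that max() picks the worst."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3  # no fresh data; assumed here to be the worst case


@dataclass
class Panel:
    """One tool's contribution to the super-dashboard (hypothetical shape)."""
    source: str
    health: Health
    last_update: datetime


# Detection target from the text: 5 minutes or less.
MAX_AGE = timedelta(minutes=5)


def effective_health(panel: Panel, now: datetime) -> Health:
    """Downgrade a panel to UNKNOWN when its data is older than MAX_AGE."""
    if now - panel.last_update > MAX_AGE:
        return Health.UNKNOWN
    return panel.health


def rollup(panels: list[Panel], now: datetime) -> Health:
    """Return the single worst state across all panels."""
    if not panels:
        return Health.UNKNOWN
    return max(effective_health(p, now) for p in panels)
```

Treating stale data as worse than a known critical state is a design choice, not a rule: the reasoning is that a panel fed only by a periodic report tells the NOC nothing about the present moment.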
Catching problems early is very important for forensics and problem determination: by the time we normally realize there is a problem, the data for that problem may be well and truly gone, which makes forensic analysis and problem determination just about impossible. If the NOC had the proper tools and responded quickly to such issues, those who know how to work a problem could be notified at any hour, 24×7, 365 days a year. Depending on the failure, this would be your administrators as well as the incident response team.
When this data is used by the NOC, written procedures will need to be created and updated for how to handle all the new events and data: which events are security related (possibly all), how to tell, and whom to notify for any incident. These changes are required as more mashups of dashboards and more dashboards for virtualization are placed into any NOC.
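Part of such a procedure could be captured as a simple routing table. The sketch below is only an illustration: the event names and team names are hypothetical, and defaulting unknown events to the security route reflects the "possibly all" caution above rather than any established standard.

```python
# Hypothetical event kinds mapped to the teams the NOC should notify.
ROUTES = {
    "host_down": ["administrators"],
    "datastore_full": ["administrators"],
    "config_change": ["administrators", "incident_response"],
}

# Assumption: anything not yet classified is treated as possibly
# security related, so both teams hear about it.
DEFAULT_ROUTE = ["administrators", "incident_response"]


def whom_to_notify(kind: str) -> list[str]:
    """Return the teams to notify for an event kind."""
    return list(ROUTES.get(kind, DEFAULT_ROUTE))


def is_security_related(kind: str) -> bool:
    """An event counts as security related if incident response is on its route."""
    return "incident_response" in whom_to_notify(kind)
```

As new dashboards land in the NOC, the table grows with them, and each new event kind starts on the cautious default route until someone classifies it.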