Is the CMDB Irrelevant in a Virtual and Cloud Based World?

Configuration Management Databases (CMDBs) have been a linchpin of the offerings from enterprise systems management vendors such as CA, IBM, BMC and HP. These products have been marketed as the foundation of both the ITIL framework for management processes and the Business Service Management frameworks offered by these vendors. While CMDBs occupy a very important place in these vendors’ product strategies, they are also enormously expensive to purchase and implement, and the time required to implement them means a long time to value for the customer. For these reasons, relatively few enterprise customers have implemented CMDBs.

Now, along come virtualization and cloud computing. This article explores what effect these new developments will have upon the role of the CMDB. The key question is whether the CMDB will now become much more important due to the changes created by virtualization and the cloud, or whether it will be rendered irrelevant by these very changes.

CMDBs – An Overview

ITIL defines a CMDB as follows:

“A database used to store Configuration Records throughout their Lifecycle. The Configuration Management System maintains one or more CMDBs, and each CMDB stores Attributes of CIs, and Relationships with other CIs.”

A configuration management database (CMDB) is often implemented using standard database technology and typically persists CI lifecycle data as records (or configuration records) in that database. Configuration records are managed according to some data or information model of the IT environment. One of the goals of the CMDB Federation (CMDBf) specification, from which this description is drawn, is to expedite the federated implementation of multiple CMDBs in a single configuration management system.
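To make the definition concrete, here is a minimal sketch of a store that holds CIs, their attributes, and their relationships to other CIs. It is an illustration only: the class names (`ConfigurationItem`, `Relationship`, `SimpleCMDB`), the `runs_on` relation and the sample records are assumptions made for this example and are not taken from ITIL or from any vendor's product.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    """One CI record: an identifier, a type, and free-form attributes."""
    ci_id: str
    ci_type: str
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    """A directed, typed link between two CIs (hypothetical model)."""
    source_id: str
    relation: str
    target_id: str


class SimpleCMDB:
    """In-memory stand-in for a CMDB: CIs plus relationships between them."""

    def __init__(self):
        self.items = {}
        self.relationships = set()

    def add_item(self, item):
        self.items[item.ci_id] = item

    def relate(self, source_id, relation, target_id):
        # Refuse links to CIs that have no record, so the graph stays consistent.
        if source_id not in self.items or target_id not in self.items:
            raise KeyError("both CIs must exist before they can be related")
        self.relationships.add(Relationship(source_id, relation, target_id))

    def related_to(self, source_id, relation):
        """Return the IDs of CIs that source_id points to via the given relation."""
        return sorted(
            r.target_id
            for r in self.relationships
            if r.source_id == source_id and r.relation == relation
        )


cmdb = SimpleCMDB()
cmdb.add_item(ConfigurationItem("srv-01", "server", {"os": "Linux", "ram_gb": 32}))
cmdb.add_item(ConfigurationItem("app-crm", "application", {"version": "4.2"}))
cmdb.relate("app-crm", "runs_on", "srv-01")
print(cmdb.related_to("app-crm", "runs_on"))
```

A production CMDB would persist these records in a database and track their lifecycle; the sketch only shows the shape of the data.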

ITIL defines (in part) a configuration management system as follows:

“A set of tools and databases that are used to manage an IT Service Provider’s Configuration data. The CMS also includes information about Incidents, Problems, Known Errors, Changes and Releases; and may contain data about employees, Suppliers, locations, Business Units, Customers and Users.”

A configuration management system is presumed to be a federation of CMDBs and other management data repositories. The federated CMDB described in this specification is a good match with the database requirements of a configuration management system.

The federated CMDB could support the following scenarios (although the scenarios that a federated CMDB supports are left entirely to the discretion of each implementation):

  • Maintain an accurate picture of IT inventory from a combination of asset information (finance) and deployment/configuration information
  • Reflect changes to IT resources, including asset and licensing data, across all repositories and data sources
  • Compare expected configuration versus actual configuration
  • Enable version awareness, such as in the following examples:
      • Coordinate planned configuration changes
      • Track change history
  • Relate configuration and asset data to other data and data sources, such as incident, problem, and service levels. The following are some examples:
      • Integration of change management and incident management with monitoring information
      • SLA incident analysis, by using the service desk and incident
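One of the scenarios listed above, comparing expected configuration with actual configuration, can be sketched in a few lines. This is only an illustration: the function name `config_drift`, the attribute names and the `"<absent>"` placeholder are assumptions made for the example, not part of any CMDB product or specification.

```python
def config_drift(expected, actual):
    """Compare two attribute dicts and return {attribute: (expected, actual)} for differences."""
    drift = {}
    for key in sorted(set(expected) | set(actual)):
        # "<absent>" is a placeholder chosen for this sketch to mark a missing attribute.
        want = expected.get(key, "<absent>")
        have = actual.get(key, "<absent>")
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical records: what the CMDB says versus what discovery found.
expected = {"os": "Linux", "ram_gb": 32, "patch_level": "2010-03"}
actual = {"os": "Linux", "ram_gb": 16, "agent": "installed"}

for attribute, (want, have) in config_drift(expected, actual).items():
    print(f"{attribute}: expected {want}, found {have}")
```

The comparison itself is trivial; the hard part in practice is keeping the "actual" side current.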

Virtualization and Cloud Driven Configuration Management Requirements

A federated CMDB is supposed to be the place where the data about the assets and their relationships to each other (the configuration of these assets) is stored. Not all of this data has to be stored in one CMDB. Instead, the notion of a federated CMDB means that the master CMDB at least knows where all of the data is, so that you can query it and it can retrieve the information that is held outside that single database.
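The idea of a master CMDB that answers a query by fetching data it does not hold itself can be sketched roughly as follows. The classes `Repository` and `FederatingCMDB`, the repository names and the "first source wins" conflict rule are all assumptions for illustration; a real federation needs an explicit reconciliation policy and would query remote systems rather than in-memory dictionaries.

```python
class Repository:
    """A management data repository holding some attributes for some CIs."""

    def __init__(self, name, records):
        self.name = name
        self.records = records  # {ci_id: {attribute: value}}

    def lookup(self, ci_id):
        return self.records.get(ci_id, {})


class FederatingCMDB:
    """Master CMDB: keeps some data locally and knows which repositories to ask."""

    def __init__(self, local_records, federated):
        self.local = Repository("master", local_records)
        self.federated = list(federated)

    def query(self, ci_id):
        # Merge local data with whatever each federated repository holds.
        # In this sketch the first repository to supply an attribute wins.
        merged = {}
        sources = {}
        for repo in [self.local] + self.federated:
            for attribute, value in repo.lookup(ci_id).items():
                if attribute not in merged:
                    merged[attribute] = value
                    sources[attribute] = repo.name
        return merged, sources


# Hypothetical repositories: finance asset data and discovered deployment data.
finance = Repository(
    "finance-assets",
    {"srv-01": {"purchase_date": "2009-06-01", "owner": "ERP team"}},
)
discovery = Repository("discovery", {"srv-01": {"os": "Linux", "owner": "unknown"}})
master = FederatingCMDB({"srv-01": {"ci_type": "server"}}, [finance, discovery])

attributes, sources = master.query("srv-01")
print(attributes)
print(sources)
```

The caller sees one merged record, while `sources` records which repository supplied each attribute.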

Virtualization and the Cloud introduce several new challenges to the idea of CMDB as currently specified and offered by the various CMDB vendors. Those challenges are:

  1. A whole new class of data gets created by the virtualization platform, specifically how the virtualization platform itself is configured in support of the guests and the applications that run on those guests.
  2. A whole new set of relationships between the elements in this data gets created. These are new relationships between hosts, hypervisors, guests, virtual networks and virtual storage that existing CMDBs were not built to handle.
  3. New information gets created at a very rapid rate. Hundreds of new guests can get provisioned in time periods much too short for the traditional Extract, Transform and Load processes that feed CMDBs to keep up.
  4. The environment can change at a rate that existing CMDBs cannot keep up with. Something as simple as vMotion events can create thousands of configuration changes in a few minutes, a rate that the entire CMDB architecture is simply not designed to handle.
  5. Having portions of IT assets running in a public cloud introduces significant data collection challenges. Leading edge APM vendors like New Relic and AppDynamics have built products that collect the data they need in a cloud friendly way. However, we are still a long way from having a generic ability to collect the configuration data underlying a cloud based IT infrastructure, quite apart from the fact that many current cloud vendors would not make this data available to their customers in the first place.
  6. The scope of the CMDB needs to expand beyond just asset and configuration data and incorporate Infrastructure Performance, Applications Performance and Service Assurance information in order to be relevant in the virtualization and cloud based worlds.
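To illustrate challenges 2 and 4, the sketch below models the guest-to-host relationship and shows how a single operation, such as evacuating a host, turns into one configuration change per guest. The `VirtualTopology` class, the host and guest names, and the round-robin placement are assumptions for this example; the sketch does not use the vSphere API or reflect how vMotion chooses a target.

```python
class VirtualTopology:
    """Tracks which host each guest runs on and logs every relationship change."""

    def __init__(self):
        self.host_of = {}      # guest name -> host name
        self.change_log = []   # (guest, old_host, new_host)

    def place(self, guest, host):
        self.host_of[guest] = host
        self.change_log.append((guest, None, host))

    def migrate(self, guest, new_host):
        old_host = self.host_of[guest]
        if old_host == new_host:
            return
        self.host_of[guest] = new_host
        self.change_log.append((guest, old_host, new_host))

    def guests_on(self, host):
        return sorted(g for g, h in self.host_of.items() if h == host)

    def evacuate(self, host, targets):
        """Move every guest off a host, round-robin across the target hosts."""
        for i, guest in enumerate(self.guests_on(host)):
            self.migrate(guest, targets[i % len(targets)])


topology = VirtualTopology()
for n in range(6):
    topology.place(f"vm-{n}", "esx-a")

changes_before = len(topology.change_log)
topology.evacuate("esx-a", ["esx-b", "esx-c"])
print(len(topology.change_log) - changes_before, "relationship changes from one evacuation")
print(topology.guests_on("esx-b"))
```

Scaled up to hosts carrying dozens of guests each, and with virtual network and storage relationships added, the same pattern produces the change volumes described above.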

For the reasons listed above, it is doubtful that today’s enterprise CMDB solutions will make the leap forward into these new dynamic and distributed virtualized and cloud based environments. Rather, it is likely that we will again see what we have seen every time a new wave of innovation has swept through IT. New solutions will be built and delivered that address the specific requirements of the new wave. These new solutions will for the most part come from new venture funded startups and perhaps from the virtualization platform companies themselves.

Who and What Will Replace the CMDB?

The first issue to address in creating relevant CMDB functionality for virtualized and cloud based environments is where the data is going to come from. The problem with this data is that there are more elements, more relationships and a higher rate of change, and therefore a higher transaction rate, than existing CMDB solutions are designed to handle. This means that a CMDB capability that is relevant to the virtualization and cloud computing worlds must be updated in real time and be continuously accurate. This in turn means that any attempt to poll the underlying environment (as is done with all existing CMDBs) is doomed to failure, because it will be impossible to poll the environment frequently enough for the configuration management solution to keep up in real time with ongoing configuration changes.
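A small simulation shows why polling loses information. In the sketch below, a guest changes host six times over ten minutes; a poller that samples every five minutes notices only two distinct states, while an event-driven collector would receive all six changes. The function name, the timestamps and the host names are made up for the example.

```python
def changes_seen_by_polling(events, poll_interval, horizon):
    """Count the state transitions a poller notices when sampling every poll_interval seconds.

    events is a list of (timestamp_seconds, state) tuples.
    """
    events = sorted(events)
    seen = 0
    last_observed = None
    state = None
    index = 0
    for t in range(0, horizon + 1, poll_interval):
        # Apply every event that happened up to this poll time.
        while index < len(events) and events[index][0] <= t:
            state = events[index][1]
            index += 1
        # The poller only notices a change if the sampled state differs from the last sample.
        if state != last_observed:
            seen += 1
            last_observed = state
    return seen


# Hypothetical placement history for one guest: (seconds, host).
events = [
    (0, "esx-a"), (40, "esx-b"), (70, "esx-a"),
    (130, "esx-c"), (200, "esx-b"), (250, "esx-c"),
]

print("events emitted:", len(events))
print("seen polling every 300 s:", changes_seen_by_polling(events, 300, 600))
print("seen polling every 10 s:", changes_seen_by_polling(events, 10, 600))
```

Polling every ten seconds does catch every change in this toy case, but doing that across thousands of guests is exactly the load that makes polling impractical, which is the argument for consuming change events as they occur.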

Since VMware vSphere is the market leading virtualization platform that creates these new problems, it is only logical that we look first to VMware both in terms of how it is making data available and how it is using that data itself. VMware has broken significant new ground with the VMsafe API, which is being used to various degrees by vendors in the VMware ecosystem. Some examples of good progress in this area are:

  • Reflex Systems was the first third party vendor to get its driver certified for VMsafe.net. This allows Reflex to see everything that flows through the hypervisor (which is everything that the hosts and guests are doing). Reflex can also do deep packet inspection to identify the applications running in the guests, and can report configuration changes along with changes in resource utilization patterns. All of this is stored in Reflex’s own database, which is accessible through the Reflex VQL query language.
  • ManageIQ has built an extremely robust virtualization management platform with deep and broad configuration change detection and configuration policy enforcement as its core functionality. ManageIQ also collects resource utilization data and can cross correlate this data with configuration change data. All of the information collected by ManageIQ is stored in ManageIQ’s own database.
  • Hyper9 uses a search based technology to be able to collect, trend and report on an incredibly wide range of configuration items and to combine these items with views of resource utilization in the virtualized environment. All of the information collected by Hyper9 is stored in Hyper9’s own database.
  • Akorri combines a leadership position in Infrastructure Performance Management with the ability to detect and accept configuration events in the vSphere environment. Akorri is also the only vendor that can cross-correlate Infrastructure Response Time with configuration issues down to the spindle level in the storage arrays. All of Akorri’s information is stored in Akorri’s own database.
  • Xangati provides a very deep view of the local and wide area network based upon data collected via NetFlow and from virtual appliances on the VMware hosts. This includes network configuration data as well as data about what is communicating with what, and how long it is taking. This data is stored in a Xangati database.
  • Netuitive has just announced a new Performance Management Database (PMDB). The PMDB takes the performance information (response times and all of the related infrastructure metrics) that Netuitive collects from the hundreds of other monitoring products it interfaces with and stores that information in one central performance oriented database. The PMDB also interfaces with popular CMDBs. This allows analysis of performance data across servers, guests, resource pools, applications, regions, business units or services.
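Several of the products above cross correlate configuration changes with performance data. The general idea, stripped of any vendor specifics, is to look for configuration changes that occurred shortly before a performance degradation. The sketch below is a generic illustration under that assumption; the function name, threshold, window and sample data are invented for the example and do not describe how any of the named products work.

```python
def changes_before_degradation(samples, changes, threshold_ms, window_s):
    """For each sample that breaches the threshold, list config changes in the preceding window.

    samples: list of (timestamp_seconds, response_time_ms)
    changes: list of (timestamp_seconds, description)
    """
    suspects = {}
    for sample_time, response_ms in samples:
        if response_ms <= threshold_ms:
            continue
        suspects[sample_time] = [
            description
            for change_time, description in changes
            if sample_time - window_s <= change_time <= sample_time
        ]
    return suspects


# Hypothetical response time samples and configuration change events.
samples = [(0, 120), (60, 130), (120, 480), (180, 125)]
changes = [
    (30, "vm-3 memory reservation changed"),
    (100, "vm-7 migrated to esx-b"),
    (170, "vm-2 snapshot deleted"),
]

for when, found in changes_before_degradation(samples, changes, 300, 60).items():
    print(f"t={when}s: slow response, recent changes: {found}")
```

A change that precedes a slowdown is only a suspect, not a proven cause, so a result like this is a starting point for analysis rather than a diagnosis.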

Conclusions

The CMDBs that were designed and architected for static physical systems appear to be too unwieldy, too difficult to keep up to date, and not real-time enough to make the transition into the virtualized and cloud based world. Virtualized environments change too fast for existing CMDBs to keep up, and the notion of keeping a CMDB up to date as assets are moved into and out of public clouds seems hopelessly beyond the original intended use case of a CMDB.

Given that traditional CMDBs are now looking like a legacy technology, what can and should take their place? The starting point needs to be a new architecture based upon a data model that includes the correct elements of these new distributed and dynamic virtualized systems, with an ability to keep this data store up to date in near real time. It is also likely that the right answer will incorporate not just configuration data, but also true performance data that includes metrics like Infrastructure Response Time and Applications Response Time. In other words, the CMDB needs to make a transition from being configuration focused to being focused upon how configuration impacts the delivery of virtualized and cloud based Business Services. This new datastore will therefore need to tie directly into existing Infrastructure Performance efforts, Applications Performance efforts and next generation Service Assurance efforts. In the near term, enterprises should look to new and creative vendors who are blending deep configuration management of virtualized and cloud based environments with leading edge performance analysis capabilities like the ones mentioned above and in the posts linked to in the article. In the long term, a new standard definition for something like Netuitive’s PMDB may replace the now archaic CMDB.