Software Defined Data Center Cloud Management

The entire purpose of constructing a Software Defined Data Center is to allow new data center services to be rapidly provisioned in response to business demands. But the business does not just want a data center service. The business wants and needs either a full development environment in support of custom application deployment, or a full business application delivered as a service. Cloud Management is the crucial layer of software that adds application-level services to SDDC services to create solutions for the business.


SDDC Operations Management

In Building a Management Stack for Your Software Defined Data Center, we proposed a reference architecture for how one might assemble the suite of management components that will be needed to manage a Software Defined Data Center (SDDC). In this post we take a look at the Operations Management portion of the reference architecture and the vendors that can provide this functionality.

The Need for a New Operations Management Vendor in Your SDDC Management Stack

So why, you ask, will an SDDC require a new approach to Operations Management, and therefore more than likely a new vendor for Operations Management? The reason is that managing the operations of an SDDC will be dramatically different from managing a static, physical data center in the following respects:

  1. Legacy Operations Management products were built on the assumptions of servers dedicated to single applications, networks implemented solely in hardware, and usually a dedicated path from the database server to the storage array. An SDDC is based upon shared servers, networks implemented in both hardware and software, and potentially a shared and multiplexed path to the storage array.
  2. Legacy Operations Management solutions were built to assume systems that changed relatively infrequently. The SDDC is built to support private clouds and IT as a Service. The whole point of both private clouds and IT as a Service is to fully automate the process by which IT services are provisioned for end users. This means that the configuration and resource allocation in an SDDC will change whenever users want it to, since users will be provisioning workloads whenever they need to.
  3. For the above two reasons, you cannot just add VMware vSphere as a data source to a legacy Operations Management solution and expect to have something useful. Operations Management for an SDDC means getting different data, getting more of it, getting it more frequently, and doing different things with it than were done in the legacy physical case.
  4. For example, the whole notion of resource contention caused by N workloads running on one server simply does not exist in the physical world. Neither does the notion that new workloads are going to show up on a server in an automated manner at the discretion of a business constituent of the IT department.
  5. The SDDC is going to be concerned with the configuration and operation of all of the CPU, memory, networking and storage resources underlying the SDDC. In the legacy world there were completely separate products for managing servers, for managing switches, and for managing storage arrays. The management of all four of these key resources will need to be combined into one Operations Management solution for the SDDC.

The Software Defined Data Center Management Stack Reference Architecture


Key Criteria for an SDDC Operations Management Solution

Since there are many vendors selling Operations Management products into the VMware and Hyper-V virtualization markets today, the most important thing to do is to evaluate these vendors on their future ability to expand their product scopes to include support for the SDDC. That would include the following key capabilities:

  1. Just about every Operations Management vendor supports more than one hypervisor today. At this point, support for at least VMware vSphere and Microsoft Hyper-V ought to be assumed as table stakes. Even if you are a 100% VMware vSphere shop today, you should at least get a statement of commitment for support of Microsoft Hyper-V and Red Hat KVM from your Operations Management vendor. This is because there is nothing wrong with having more than one hypervisor. However, building a management stack as depicted above that is different for each of two or three hypervisors would re-create the management mess that characterizes Operations Management in the physical world for most enterprises.
  2. The ability to handle the scale and scope of your environment. This requirement produces drastically different results depending upon the size of your environment, the diversity of the hardware in your environment, and the nature of the workloads in your environment. At the low end (100 physical hosts), the idea is to end up with one simple-to-implement product that collects data from the standard management interfaces available at each layer of the SDDC and does appropriate analysis and presentation of that data. At the high end (5,000 to 10,000 hosts), commodity data is going to equal commodity results. You will want to invest in an Operations Management solution from a vendor that understands this and has the ability to collect unique and valuable data through its own R&D efforts.
  3. Today’s Operations Management solutions focus primarily upon the management of physical and virtual servers. Little attention is paid to the virtual network that exists today in the form of the vSwitch, and the only attention that most vendors pay to storage is to consume the storage metrics that VMware makes available in the vSphere API. This will have to change dramatically. Managing the virtual network layer and the virtual storage layer will be much more demanding for Operations Management vendors than managing CPU and memory contention.
  4. Today, relatively few of VMware’s customers have fully implemented private cloud or IT as a Service environments. The point of the SDDC is to support the creation of these environments. So Operations Management solutions are going to have to change significantly to provide the level of management needed for large-scale and dynamic systems.
  5. The combination of having to manage CPU, memory, networking and storage, having to manage a large-scale environment, and having to cope with the constant changes driven by the automation in private clouds supporting IT as a Service will require different Operations Management solutions than those that we have today.

Who Could Provide Operations Management for the SDDC?

First, let’s make a very important point. Since the SDDC does not exist yet, no one has an Operations Management product for an SDDC today. We have to wait for VMware to deliver upon the recently announced NSX network virtualization components, and to deliver on the rumored but not yet announced storage virtualization projects. Given how things are unfolding, and have unfolded in the past, there are good reasons to hope that further announcements and delivery dates will be provided at VMworld this fall.

Given that no Operations Management product for an SDDC exists today, what we are left with is informed speculation as to who might deliver such an Operations Management solution. Note that this is 100% speculation based upon an analysis of each vendor’s strategy in the Operations Management space today.


VMware

VMware is a leader in the Operations Management business for virtualized data centers today with its vCenter Operations product. Since VMware is the only vendor on the planet that has announced the intention to build and deliver an SDDC, it is a reasonable assumption that VMware will evolve vCenter Operations to be able to manage its own SDDC. Although it seems obvious that VMware would go down this path, there are tremendous challenges for VMware as it expands the scope of vCenter Operations in this manner. Some of these challenges were outlined in our Big Data for the SDDC post. Basically, VMware has to start by ripping the existing data store out of vCenter Operations and replacing it with something most likely built by the Log Insight team from Pattern Insight. Next, VMware has to add the relevant metrics at the relevant level of frequency for the virtual networking and virtual storage layers. This is going to require the new big data back end, since there will be so many new metrics arriving at such a rate that the existing data store would have no chance of keeping up. Finally, the analytics in vCenter Operations will have to go through a significant evolution to deal with this new torrent of data and to be able to provide effective cross-domain root cause analysis. VMware likely understands each of these challenges very well. However, VMware is unlikely to address all of them across the diversity of its own customer base, leaving plenty of room for third-party vendors.


Zenoss

If you are looking to throw out your legacy physical Operations Management solution and replace it with something that is built from the ground up for the SDDC and for the private cloud and IT as a Service use cases for the SDDC, then Zenoss would be a good place to start. Operations Management starts with the ability to manage events and manage the impact of events upon the availability of the physical and virtual environment. Zenoss has a completely modern event management system, and if your environment is of the scale and diversity that event management is needed, then Zenoss is a great place to start.


VMTurbo

One of the key points behind building and using an SDDC will be that it will be possible to automate many things that are not or cannot be automated today. VMTurbo uniquely solves the problem of fully automating the process by which the important workloads in your environment are assured that they get the resources that they need to meet their SLAs. VMTurbo does this by allowing you to prioritize your workloads, and then by using the virtual CPU, virtual memory, network I/O control and storage I/O control interfaces in vSphere to ensure that the highest priority workloads get the resources that they need. This is precisely the kind of approach that will be essential to the smooth operation of the SDDC, as there will be no way for humans to keep up with resource allocation decisions as private clouds and IT as a Service get deployed in your SDDC.
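
The idea of granting resources to the highest priority workloads first can be illustrated with a minimal sketch. This is not VMTurbo's algorithm or API; the `Workload` class, the `allocate` function, the workload names and the MHz figures are all hypothetical, and the sketch assumes a simple greedy allocation of a single resource (CPU) on a single host.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int     # assumption: lower number means more important
    demand_mhz: int   # hypothetical CPU needed to meet the workload's SLA


def allocate(workloads, capacity_mhz):
    """Grant CPU in priority order until the host's capacity runs out."""
    grants = {}
    remaining = capacity_mhz
    for w in sorted(workloads, key=lambda w: w.priority):
        granted = min(w.demand_mhz, remaining)
        grants[w.name] = granted
        remaining -= granted
    return grants


# Hypothetical workloads competing for one 10,000 MHz host.
workloads = [
    Workload("batch-report", priority=3, demand_mhz=4000),
    Workload("order-db", priority=1, demand_mhz=5000),
    Workload("web-frontend", priority=2, demand_mhz=3000),
]
grants = allocate(workloads, capacity_mhz=10000)
```

In this sketch the two higher priority workloads receive their full demand and the lowest priority workload absorbs the shortfall. A real product would have to make this decision continuously and across CPU, memory, network and storage at once, which is why humans cannot be expected to keep up.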


Cirba

Like VMTurbo, Cirba comes at the Operations Management problem for the SDDC with a heavy dose of analytics. However, the focus of Cirba is more upon making sure that the physical capacity of the infrastructure underlying the environment is properly utilized and allocated. This will prove to be an essential capability for the management of the SDDC, as all of the automation in the world will end up being useless if the underlying physical capacity across the four key resource areas does not exist or is not properly allocated. Conversely, the tendency to over-provision in the name of reducing risk is likely to be just as strong for the SDDC as it has been historically for physical environments, making Cirba an essential cost management tool.

The Quest Software Division of Dell

When Quest Software bought vKernel, two market leading products were brought together under one roof. One was the vFoglight product from Quest. The other was the vOperations product from vKernel. These products have now been combined into the Quest vOPS product line. This product line is unique in that it retains the two key aspects of the parent products. On the low end the product is extremely easy to try, implement and purchase (a legacy of vKernel). At the high end (a legacy of vFoglight and the rest of the Foglight product line), the product is a fully enterprise capable solution that can be combined with numerous other Quest offerings to solve complex end-to-end and cross stack Operations Management issues.

Reflex Systems

Reflex Systems is unique in the Operations Management space in that the company long ago decided to architect its solution for very large environments and for large amounts of rapidly arriving data. Reflex Systems is one of the few Operations Management vendors that can collect the operations and configuration data in a VMware environment directly from each vSphere host every 15 seconds, as opposed to waiting for the 5-minute roll-up of that data from the vSphere API. The ability to do this for the largest of VMware’s customers, supported by the analytics required to analyze this data and a user interface capable of making sense of the quantity of the data and the scale of its source, makes Reflex Systems a unique Operations Management vendor today. The foundation upon which the Reflex Systems product is built positions the company extremely well for Operations Management of the forthcoming SDDC.
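
To see why collection frequency matters, consider a minimal sketch of how averaging 15-second samples into a single 5-minute value can hide a short spike. The metric values, the `rollup` function and the variable names are hypothetical illustrations; nothing here is taken from the Reflex Systems product or the vSphere API, and the roll-up is assumed to be a plain average.

```python
from statistics import mean

SAMPLE_INTERVAL_S = 15    # per-host collection interval cited above
ROLLUP_INTERVAL_S = 300   # 5-minute roll-up interval cited above


def rollup(samples, sample_interval_s=SAMPLE_INTERVAL_S,
           rollup_interval_s=ROLLUP_INTERVAL_S):
    """Average fine-grained samples into one value per roll-up window."""
    per_window = rollup_interval_s // sample_interval_s  # 20 samples per window
    return [
        mean(samples[i:i + per_window])
        for i in range(0, len(samples), per_window)
    ]


# Hypothetical contention metric (percent): 18 quiet samples, then a 30-second spike.
samples_15s = [2.0] * 18 + [60.0, 60.0]

five_minute_view = rollup(samples_15s)
peak_seen_at_15s = max(samples_15s)        # the spike is visible at full height
peak_seen_at_5min = max(five_minute_view)  # the spike is averaged away
```

With these made-up numbers the 15-second view shows a peak of 60 percent, while the single 5-minute value is (18 × 2 + 2 × 60) / 20 = 7.8 percent, which would look unremarkable on a dashboard.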


CloudPhysics

What if the right way to approach the problem of collecting the Operations Management data for an SDDC and then analyzing that data is to use the approach that Google took to collecting and analyzing data for its own data centers? If you are willing to consider that possibility, then consider CloudPhysics, a vendor with a cloud-hosted (delivered as a service) operations management solution. One of the key reasons that CloudPhysics may be able to provide something of extraordinary value is that the company has a strategy of applying Google-quality analytics to Google-size data sets. The analytics come from a world-class team of people, some of whom previously worked at Google. The data today is collected by virtual appliances installed at CloudPhysics’ customer sites (in their respective VMware environments). This puts CloudPhysics in the unique position of being able to do analytics across the operations management data from many customers, which will likely result in features and benefits simply not possible from on-premise solutions.


Splunk

Splunk is in fact the only vendor on the planet from whom you can purchase an on-premise big data datastore, which is today being populated not just by various logs but, by virtue of the Splunk App for VMware and the Splunk Apps for Citrix, by true Operations Management data for these environments. In fact, if you go to SplunkBase and do a search on “virtual” you will find 11 different operations management applications feeding data into Splunk. Splunk has a strategy of being the management data platform across operations management, application performance management and security, and certainly bears watching as it evolves its strategy and product offerings in the direction of the SDDC.


Hotlink

If your current virtualization environment or your future SDDC spans more than one hypervisor, but your primary environment is VMware, then you really need to consider Hotlink. Hotlink offers something different from any other Operations Management solution profiled here. Hotlink lets your VMware administrators administer Hyper-V, KVM and Amazon EC2 environments from within the vCenter management console in exactly the same manner as they manage a vSphere environment. This gives rise to a new meaning for cross-platform. In Hotlink’s world, cross-platform does not just mean that an Operations Management or Cloud Management solution works across two or more hypervisors. It means that you can use one management console (vCenter) to manage all of these environments, migrate workloads across these environments, and leverage your vSphere management conventions (like snapshots) across all of these environments.


Intigua

Count how many management agents of various types (operations management, application performance management, security, backup, etc.) you have deployed in each virtual machine in your environment. Now multiply that by the number of VMs in your environment. If the thought of having to manage and update all of those agents (and prevent their misbehavior from affecting your environment) gives you a headache, then Intigua is for you. Intigua applies application virtualization techniques to the management agents in your virtualized server environment (think App-V for management agents on servers). This makes it much easier to manage the agents in your environment and allows you to set policies that prevent those agents from harming your environment.
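
The multiplication described above is easy to run for your own environment. The following is a minimal sketch; the VM count is a made-up figure, the agent list simply reuses the four agent types named in the text, and `total_agent_instances` is a hypothetical helper rather than part of any Intigua tooling.

```python
def total_agent_instances(agent_types_per_vm, vm_count):
    """Number of agent installations to patch, monitor and keep well-behaved."""
    return agent_types_per_vm * vm_count


# The four agent types named in the text, assumed to be present in every VM.
agent_types = [
    "operations management",
    "application performance management",
    "security",
    "backup",
]
vm_count = 2000  # hypothetical environment size

instances = total_agent_instances(len(agent_types), vm_count)
```

With four agent types in each of 2,000 VMs, that is 8,000 agent instances to manage, and every agent upgrade cycle touches 2,000 machines.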

ManageEngine and SolarWinds

If all of the above sounds too complex and too expensive for you because your environment is just not that large and just not that complex, then you need to focus upon solutions that rely only upon the management data available from standard management APIs (the vSphere API, WMI, SNMP, SMI-S, etc.) and that are easy to evaluate, easy to implement, and easy and affordable to purchase. If you consider yourself to be an SMB or an SME, then the products from ManageEngine and SolarWinds are for you. The questions to ask here are how quickly these products deliver value to you and how little manual configuration work you have to do to get that value. Most importantly, ask how little ongoing maintenance work you will have to do to keep your environment up and running.


The SDDC is going to require a new approach to Operations Management. Vendors with effective Operations Management solutions for today’s virtualized data centers are in the best position to expand their offerings for the SDDC. Legacy vendors face a complete rewrite of their products and the adoption of a new business model (easy to try and easy to buy) that will destroy them financially; they will therefore be unable to react to the SDDC either technically or financially.