The Virtualization Practice

Distributed Virtual Switch Failures: Failing-Safe

In my virtual environment recently, I experienced two major failures. The first was with the VMware vNetwork Distributed Switch and the second was related to the use of VMware vShield. Both led to catastrophic failures that could easily have been avoided if these two subsystems failed safe instead of failing closed. VMware vSphere is all about availability, but when critical systems like these fail, not even VMware HA can assist in recovery. You have to fix the problems yourself, usually by hand. Now that the problem has been solved and should not recur, I began to wonder how I missed this, which led me to the total lack of information on how these subsystems actually work. So without further ado, here is how they work and what I consider to be the definition of fail-safe.

I have been a big fan of VMware’s Distributed Resource Scheduler (DRS). VMware DRS is a feature that dynamically allocates and balances computing resources across the hosts in a cluster. In all of the environments I have worked with so far, DRS has been a fantastic tool for getting and maintaining that balance across all the hosts in a cluster. Recently, though, I have come across a limitation of VMware’s DRS that is worth mentioning.

MultiPoint Server is the Cinderella of the Windows world, locked away in the cellar of the education sector, kept away from the bright lights of publicity and severely limited in what it could offer. But that could well be changing given Microsoft’s recent efforts to revamp the product. Although the product is not yet quite ready for shipping, Microsoft has been working hard to add value to MultiPoint Server, and when it ships in March, it looks like Microsoft might have a winner on its hands.

Tod Nielsen has already succeeded twice at what he is now being asked to do at VMware: once at Microsoft and once at BEA. This time what hangs in the balance is VMware’s ultimate destiny. Will VMware be the device driver to the dynamic data center (vSphere), or will VMware be that and the next generation application platform for IT as a Service and public cloud-based applications?

Application Virtualization allows users to use potentially conflicting software in the same workspace. Towards the end of 2010 there was a great deal of discussion about the complexity of using application virtualization to finally let corporations end their dealings with the recalcitrant Internet Explorer 6.

In Virtualizing Internet Explorer: Microsoft takes the ball home and goes home, we discussed why solving IE6 issues with Application Virtualization is difficult. Then, in December, we reported that Browsium had crafted a lifeline and suggested a release date around the end of 2010.

To quote Robert Burns, “The best-laid schemes o’ mice an’ men Gang aft agley.” Still, Browsium have announced the release candidate to their beta testers. With its release, is it time to put IE8 compatibility issues to bed?

At last year’s VMworld in San Francisco, Stephen Deasy (Director, R&D, VMware) and Srinivas Krishnamurti (Senior Director, Mobile Solutions, VMware) announced VMware’s plans for a type II mobile hypervisor platform. Three months later, VMware and LG announced a partnership to install VMware Mobile Virtualization Platform (MVP) on LG smartphones starting in 2011. While significant questions remain about the viability of this partnership, the need for a mobile virtualization solution cannot be stressed enough.

On the second Virtualization Security Podcast of 2011, we had Doug Hazelman of Veeam as our guest panelist to discuss backup security. Since most of backup security relies on the underlying storage security, we did not discuss this aspect very much other than to state that the state of the art is still to encrypt data at rest and in motion. What we did discuss is how to determine where your data has been within the virtual or cloud environment. This is important if you need to know what disks or devices touched your data, which is an auditing requirement for high-security locations. So we can take from this podcast several GRC and Confidentiality, Integrity, and Availability elements