Will Scale Out Architectures Revolutionize Virtualization and the Cloud?

If you are thinking about your future data center, what design point or goal is top of mind? Are you thinking about “virtualizing everything” around current and future versions of VMware vSphere? Or are you instead thinking about what it would be like to have your own instance of Amazon EC2 or Google AppEngine?

There are significant differences in the objectives, benefits, and tradeoffs that lie between these two choices. In the case of VMware vSphere it is important to look at it first and foremost as a technology that is designed to bring forward a great deal of your existing operating systems and applications (with no changes required to either) into a more flexible, agile, dynamic, shared, cost-effective and modern world. The major benefit of vSphere is that it can deliver cost reductions and dramatic improvements to both technical and business agility while retaining backward compatibility with so much of what you already own.

Recognizing that vSphere is the clear technical and market leader for enterprise data center virtualization does not, however, mean that the fundamental approach taken by vSphere and the applications that run on it is free of limitations. These limitations are:

  1. vSphere relies upon shared storage, particularly a SAN, in order for some of its most important and beneficial features to work. vMotion, Storage vMotion, and the stack of features and benefits that rely upon these two core features are dependent upon the existence of a SAN.
  2. There are configuration limitations throughout the vSphere system. These include limitations on the number of LUNs per host and HBAs per host, limitations on the number of virtual machines per volume of storage, limitations on the number of files per volume, limitations on the number of virtual network ports per vCenter, resource pool maximums, and limits on the scale of the environment that can be managed by a tier of vCenters.
  3. vCenter, and the other management products that operate in the vSphere environment (be they vCloud Director, CapacityIQ and other VMware products, or products from the third party ecosystem) usually rely upon relational databases as their back end data stores. Relational databases are not horizontally scalable (you cannot just add database servers infinitely as the data set and transaction load grows), and therefore their use represents an important underlying limitation in any environment that uses them.

Most of these limitations will never come into play for most data center environments, and VMware and its competitors will continue to expand these limits. Even so, there are still important differences between what one can (and cannot) accomplish in a very large vSphere environment and what Amazon.com has done with AWS and what Google has done with AppEngine. There are also internal IT environments (for example Facebook) that have been designed around the principles of infinite horizontal scalability.

So what would such an infinitely scalable architecture look like? You cannot go buy the software stacks that Amazon and Google run on (though you certainly can rent them), so if you want one of these of your own, what approach would you take? The first step is that you need a data storage paradigm that is not based upon SQL, and that provides both automatic redundancy and horizontal scalability. One such data store is Cassandra, an Apache open source project. According to the Cassandra Wiki, Cassandra is “a highly scalable, eventually consistent, distributed, structured key-value store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google’s BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems. Cassandra was open sourced by Facebook in 2008, where it was designed by one of the authors of Amazon’s Dynamo and a Facebook engineer. In a lot of ways you can think of Cassandra as Dynamo 2.0 or a marriage of Dynamo and BigTable. Cassandra is in production use at Facebook but is still under heavy development”.
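To make "automatic redundancy and horizontal scalability" a little more concrete, here is a minimal sketch of Dynamo-style consistent hashing, the general partitioning idea that stores like Cassandra build on. It is not Cassandra's actual code: the class name `HashRing`, the node names, the replica count of 3, and the 64 virtual tokens per node are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring (hypothetical; not Cassandra's implementation)."""

    def __init__(self, nodes, replicas=3, vnodes=64):
        self.replicas = replicas  # copies kept of every key (assumed value)
        self.vnodes = vnodes      # tokens per node, to even out the load
        self._ring = []           # sorted list of (token, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _token(value):
        # Map any string to a position on the ring.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Scaling out only adds tokens; existing tokens never move.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._token(f"{node}#{i}"), node))

    def nodes_for(self, key):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        if not self._ring:
            return []
        distinct = {node for _, node in self._ring}
        wanted = min(self.replicas, len(distinct))
        start = bisect.bisect(self._ring, (self._token(key), ""))
        owners = []
        for step in range(len(self._ring)):
            _, node = self._ring[(start + step) % len(self._ring)]
            if node not in owners:
                owners.append(node)
                if len(owners) == wanted:
                    break
        return owners


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.nodes_for("user:42"))  # three distinct nodes hold this key
```

The point of the design is that adding a server only claims a slice of the ring: any key whose first owner changes moves to the new node, and every other key stays where it was. That is what lets such a store grow by adding machines rather than by buying a bigger one.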

Moving away from SQL carries with it the first big impact of going to an infinitely scalable architecture. Backward compatibility with every application that uses SQL Server, Oracle, MySQL, etc. is broken by this move. That means that only new applications that are written to Cassandra instead of an RDBMS will work on this new infrastructure (assuming you choose something like Cassandra as the data store).

The second key architectural assumption is that while this new environment will rely upon a hypervisor, it will not rely upon shared storage in the same way that vSphere relies upon a SAN. Notice that there is no concept of vMotion-like functionality in the Amazon and Google clouds. Amazon AWS relies upon a Xen-derived hypervisor, but uses scaled-out commodity storage instead of a SAN. This means that the functionality one gets from vMotion is not present in this new architecture and, to the extent that it is necessary, needs to be implemented at other layers in the system. In particular, redundancy and the ability to fail over need to be implemented either in the applications framework (the Java applications server) or in the applications themselves.
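As a rough illustration of what "failover in the application" can mean, the sketch below tries a list of replicas in order and moves on when one is unreachable. The names `call_with_failover` and `AllReplicasFailed`, and the use of plain callables to stand in for remote replicas, are hypothetical choices for this example and do not come from any particular framework.

```python
class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request (hypothetical name)."""


def call_with_failover(replicas, request):
    """Try each replica in turn; return the first successful response.

    `replicas` is a list of callables standing in for remote endpoints.
    A replica signals that it is unreachable by raising ConnectionError.
    """
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            errors.append(exc)  # remember the failure, try the next copy
    raise AllReplicasFailed(f"{len(errors)} replicas failed")


def dead_replica(request):
    raise ConnectionError("host unreachable")


def live_replica(request):
    return f"handled {request}"


print(call_with_failover([dead_replica, live_replica], "GET /cart"))
# prints "handled GET /cart"
```

In a vSphere environment the platform hides a host failure from the application. Here the application (or its framework) has to know that several copies exist and decide what to do when one disappears.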

Some Vendors to Look At

Right now the single most commercially viable way to procure an infinitely scalable and elastic environment is to rent one from Amazon.com or Google. However, there are many vendors and products that are targeting the ideal of letting you build one of these yourself:

  • As mentioned above, Cassandra is perhaps a key and unique building block for an infinitely scalable architecture. Cassandra is able to write data across multiple physical data stores with very short write times (it is unusual in that writes are faster than reads). Cassandra and products like it may be the key to this re-invention of computing architectures. What they offer is needed not just by infinitely scalable applications, but also by the management products (think write-heavy monitoring applications) that will be required to monitor these environments.
  • Since a hypervisor is required there is obviously a role for ESXi, Xen, and KVM. The issue here is really one of how these products are licensed. VMware is really not all that interested in licensing just ESXi and structures its product packaging and licensing so that you really end up needing to buy an edition of its virtualization platform – VMware vSphere. The most effective hypervisor for this use case is probably either the open source or Citrix sourced version of Xen.
  • If vSphere is not the layer of software that provides basic management of the environment and scale out elasticity, then something else will have to be. The currently leading candidates are Eucalyptus and Nimbula. Both companies are building and offering virtualization and cloud management platforms designed for companies that really want “their own EC2”.
  • Implementing “Desktops as a Service” (DaaS) in a truly scalable manner may in fact require this approach. This concept was addressed in an earlier post “The Grid Approach to Desktop Virtualization”. Getting rid of the requirement for a back end SAN both dramatically increases the scalability of the system and reduces its cost.
  • Application Virtualization technologies may play a critical role in bringing some of the benefits of this approach to existing and legacy Windows applications. In particular Microsoft’s efforts on the Server App-V front may demonstrate that separating the application from the underlying OS will be at least as valuable, and perhaps more valuable than was separating the OS from its underlying hardware.
  • It is critical not to overlook the role of the applications framework in this architecture. VMware is clearly thinking along these lines in terms of how it evolves vFabric. Notice that vFabric is not tied to vSphere in any kind of proprietary or architectural manner. Yes there will be an ever growing set of benefits to running vFabric applications on vSphere and on vCloud – but it is just as important that VMware has reached out to SalesForce.com (VMForce) and Google to make vFabric into the Java runtime layer for these public clouds. What this means to you as a customer is that if you build your application to vFabric you get the best of all worlds. You can run it on vSphere, you can run it on an alternative scale out infrastructure in your own data center, and you can run it in two of the market leading public clouds.
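The Cassandra item in the list above notes that writes are faster than reads. One way to see why is a toy log-structured store: a write is an append plus an in-memory update, while a read may have to look in memory and then in several flushed tables. The class `TinyLSMStore` and its `memtable_limit` parameter are invented for this sketch, and Python lists and dicts stand in for on-disk files; it illustrates the general idea rather than Cassandra's real storage engine.

```python
class TinyLSMStore:
    """Toy log-structured store: writes append, reads may search several places."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # flush threshold (assumed value)
        self.commit_log = []  # stand-in for a sequential on-disk log
        self.memtable = {}    # recent writes, held in memory
        self.sstables = []    # immutable flushed tables, oldest first

    def write(self, key, value):
        # A write never searches for old data: append, update memory, done.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.sstables.append(dict(self.memtable))  # flush to a new table
            self.memtable = {}

    def read(self, key):
        """Return (value, places_checked), newest data first."""
        checked = 1
        if key in self.memtable:
            return self.memtable[key], checked
        for table in reversed(self.sstables):
            checked += 1
            if key in table:
                return table[key], checked
        return None, checked


store = TinyLSMStore()
for key, value in [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("a", 9)]:
    store.write(key, value)

print(store.read("a"))  # (9, 1): newest value, found in memory
print(store.read("b"))  # (2, 3): had to check memory and two flushed tables
```

This write-cheap, read-more-expensive shape is part of why such stores suit the write-heavy monitoring workloads mentioned in the list.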


While vSphere provides significant benefits in terms of cost savings and business agility, those benefits are tied to and constrained by the ability of vSphere to provide backward compatibility with existing legacy enterprise systems. This backward compatibility makes it impossible for vSphere to provide infinite horizontal scalability. Moving to the same architecture as the most highly scaled out public cloud vendors provides a more radical set of benefits, but at the cost of breaking backward compatibility for many applications.
