We’ve been following Eucalyptus over a series of posts, and have recently seen the company strengthen its management team with the appointment of new CEO Marten Mickos, the (only) ex-CEO of MySQL. This week they released a new version of the Eucalyptus product, Version 2.0, which carries some of his strategy, particularly in putting clear water between the Open Source and the Enterprise versions of the product.
It strikes us that the Eucalyptus story of Public/Private cloud interoperability is strongly related to the dynamics around Public Cloud interoperability, and that the momentum Eucalyptus has gathered holds lessons for the wider debate around Infrastructure as a Service (IaaS) API standards.
Eucalyptus 2 New and Noteworthy
The basic premise with Eucalyptus is that a customer treats its data center as an extension of the Amazon Cloud (or perhaps the Amazon Cloud as an extension of its data center), using the same tooling for both and migrating from one to the other at will (cloud-bursting).
Eucalyptus 2.0 contains a number of new features to allow organizations to pursue a hybrid private/public cloud strategy. Eucalyptus is dual-licensed with GPL and a commercial “enterprise” version featuring, for example, VMware hypervisor and SAN support. Eucalyptus 2 offers two significant new features in the Enterprise version – first it now supports certain versions of Windows (2003, 2008 Server and Windows 7) as guests, and second it has its own authentication layer which controls access to instances within the cloud.
When we previously spoke to Eucalyptus they said they weren’t going to do authentication, so it is a bit of a change of direction, but it is inherently right; user authentication is a cloud service as well as a guest or application service. Their authentication layer is LDAP-based, so in principle you could glue it into an existing enterprise directory, although it would require a certain amount of coding. Also, the authentication layer doesn’t propagate into the public cloud when you cloudburst; you would have to deal with that yourself.
Eucalyptus and Cloud API Standards
To achieve interoperability, Eucalyptus implements the Amazon EC2 APIs. This can be thought of as a pragmatic solution in the absence of a standard API that can be used for multiple IaaS clouds, but it limits cloud-bursting to EC2, rather than any other public cloud.
One of the more interesting aspects of the conversations we had with Eucalyptus was their insistence that they are not wedded to the EC2 API. They chose it because of its status as a de facto standard, and whilst they don’t currently support other APIs they could move quickly to do so.
Cloud API differences
Whilst VMware’s DMTF standards initiative for Cloud APIs goes on behind closed doors, the big news in March was a (by all accounts) vociferous debate at CloudConnect at which (and subsequent to which) it seems no progress was made. Some of the heat in this debate is about the style of API in use. It is worth doing a comparison between, say, the very different APIs offered by Amazon EC2 and Rackspace. We will focus on the control APIs, not the APIs that access storage (although there are differences there too).
EC2 offers two APIs, a SOAP API and a query API carried directly across HTTP. This feels like a programming language-level API that has been mapped down into HTTP and XML, and the result is actually quite ugly. The Rackspace API is REST-based and carries either XML or the more fashionable (and lightweight) JSON as a payload.
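To make the stylistic difference concrete, here is a minimal sketch of how the same "list my servers" request is shaped in each style. The endpoints are placeholders, authentication and signing parameters are left out entirely, and nothing is sent over the network; only the EC2 action name DescribeInstances is taken from the real API.

```python
from urllib.parse import urlencode


def query_style_call(endpoint, action, **params):
    """Query style: the operation name travels as a request parameter."""
    query = {"Action": action, **params}
    return "GET", f"{endpoint}/?{urlencode(query)}"


def rest_style_call(base, *segments):
    """REST style: the path names the resource, the HTTP verb is the operation."""
    return "GET", "/".join([base.rstrip("/"), *segments])


# Placeholder endpoints, for illustration only (assumptions, not real URLs).
ec2_call = query_style_call("https://ec2.example.invalid", "DescribeInstances")
rest_call = rest_style_call("https://cloud.example.invalid/v1.0", "servers")

print(ec2_call)   # the verb says little; the Action parameter carries the meaning
print(rest_call)  # the verb and the path together carry the meaning
```

In the query style every operation is a different Action against one endpoint; in the REST style there are a handful of verbs applied to many resource paths.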
For those unfamiliar with REST, it offers a simple interface built on the HTTP POST, GET, PUT and DELETE verbs. It can be used to access a filesystem in the manner of the Amazon S3 API, but it also makes sense to use these URLs to describe resources (such as machines, disks etc.) in a pseudo-directory structure. For example, the public IP address of a server is accessible at “http://<base management URL of my cloud>/servers/<server id>/ips/public”. To manipulate it you can use POST, PUT, DELETE, GET etc.
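As a rough sketch of the idea, the snippet below builds the resource URL from the example above and attaches different verbs to it. The base URL and server id are made up, the requests are constructed but never sent, and whether a given cloud accepts every verb on every resource is up to that provider.

```python
from urllib.request import Request

# Hypothetical base management URL; a real cloud would supply its own.
BASE_URL = "https://cloud.example.invalid/v1.0"


def public_ips_url(server_id):
    """Pseudo-directory path to one server's public IP addresses."""
    return f"{BASE_URL}/servers/{server_id}/ips/public"


def build_request(method, url):
    """Create (but do not send) an HTTP request for a verb and a resource."""
    return Request(url, method=method)


# Same resource, different verbs: the verb selects the operation.
read_ips = build_request("GET", public_ips_url("1234"))
drop_ips = build_request("DELETE", public_ips_url("1234"))

print(read_ips.get_method(), read_ips.full_url)
print(drop_ips.get_method(), drop_ips.full_url)
```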
All the more recent cloud control APIs have gone REST, and although there are differences in the way they do authentication, when you look at it there isn’t really that much to standardize, only the sequence of words that make up the REST paths. It is not beyond the wit of humanity to have a few standard words like “server”, “IP”, “public” and an extension mechanism to add more. There should be enough common ground to get the basics standardized, and it should then be very simple to build tooling that can deal with the variability across environments. Except of course that we would then be left with the dominant player, EC2, which looks completely different.
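One way to picture "a few standard words plus an extension mechanism" is a small vocabulary table that each provider can extend or override. This is purely a thought experiment: the words, the provider names and the path segments below are invented for illustration and do not come from any published standard.

```python
# Hypothetical shared vocabulary: neutral word -> path segment.
STANDARD_WORDS = {"server": "servers", "ip": "ips", "public": "public"}


def make_vocabulary(extensions=None):
    """Start from the standard words, then apply provider-specific additions."""
    vocab = dict(STANDARD_WORDS)
    vocab.update(extensions or {})
    return vocab


def build_path(vocab, *parts):
    """Translate neutral words into path segments; anything not in the
    vocabulary (such as a server id) passes through unchanged."""
    return "/" + "/".join(vocab.get(part, part) for part in parts)


# Provider A uses the standard words as they are.
provider_a = make_vocabulary()
# Provider B renames one word and adds one of its own.
provider_b = make_vocabulary({"server": "machines", "snapshot": "snapshots"})

path_a = build_path(provider_a, "server", "1234", "ip", "public")
path_b = build_path(provider_b, "server", "1234", "snapshot")
print(path_a)
print(path_b)
```

Tooling written against the neutral words would then only need a different table per environment, which is the sense in which the variability looks manageable for the REST-style APIs.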
Understanding the role of Eucalyptus
The discrepancy over API structure is just about elegance and technical egos and it could probably be fixed, but one issue that seems to have emerged at CloudConnect is the difficulty of driving cloud standards from a software perspective. By this we mean that the main actors are the IaaS service providers rather than their software suppliers (be they commercial licensors or open source communities). So, whilst we at the Virtualization Practice naturally think of this as software, the alternative view is that IaaS is simply a more flexible (and higher margin) alternative to traditional co-location or managed hosting. These companies typically have very high levels of customer loyalty, and to the extent that standards would encourage customers to change suppliers they are unlikely to promote standardization.
So standardizing these APIs is probably barking up the wrong horse, because the IaaS providers will continue to be motivated to differentiate their offerings and thus any standards that emerge will be loose at best. What we should be thinking of is the motivations of those who care about interoperability at this layer and what they are likely to deliver, and Eucalyptus falls into this category.
We start to consider whether IaaS interoperability is a tools issue rather than a standards issue, and Eucalyptus can be considered such a tool rather than a platform. In other words there comes a point when you pick up Eucalyptus not because it is an Internal Cloud that is exactly the same as EC2, but because (once it starts to implement multiple Cloud APIs) it contains tooling that lets you bridge Amazon and Rackspace and others, a bit like the way you can use a database administration tool to abstract over the bits of SQL that were never really standardized.
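A bridging tool of this kind could be sketched as a common interface with one adapter per API style. This is not how Eucalyptus is built (as noted earlier, it does not currently support other APIs); the class names and endpoints are hypothetical, signing and authentication are omitted, and only the EC2 action name TerminateInstances is taken from the real API.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlencode


class CloudAdapter(ABC):
    """Neutral interface that the tooling is written against."""

    @abstractmethod
    def delete_server(self, server_id):
        """Return the (method, url) pair that removes a server."""


class QueryStyleAdapter(CloudAdapter):
    """Maps the neutral call onto an EC2-like query API."""

    def __init__(self, endpoint):
        self.endpoint = endpoint

    def delete_server(self, server_id):
        params = {"Action": "TerminateInstances", "InstanceId.1": server_id}
        return "GET", f"{self.endpoint}/?{urlencode(params)}"


class RestStyleAdapter(CloudAdapter):
    """Maps the neutral call onto a REST-like resource API."""

    def __init__(self, base):
        self.base = base.rstrip("/")

    def delete_server(self, server_id):
        return "DELETE", f"{self.base}/servers/{server_id}"


# The caller uses one method name regardless of which cloud sits behind it.
clouds = {
    "query": QueryStyleAdapter("https://ec2.example.invalid"),
    "rest": RestStyleAdapter("https://cloud.example.invalid/v1.0"),
}
plans = {
    "query": clouds["query"].delete_server("i-12345678"),
    "rest": clouds["rest"].delete_server("1234"),
}
for name, plan in plans.items():
    print(name, plan)
```

The point of the sketch is that the difference between the two API styles is absorbed inside the adapters, which is the role the database administration tool plays for non-standard SQL.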