Many people seem to think that in this brave new world of converged infrastructure and software-defined everything, the era of standalone storage and networking is coming to an end. Indeed, it’s becoming quite popular to think differently about storage. There are new types of clustered and distributed storage options, like Ceph and Gluster, that rethink the way storage is delivered and built. There are virtual storage appliances (VSAs), like the HP StoreVirtual VSA and NexentaVSA, that essentially replicate standalone hardware in a virtual machine. There are also hybrid approaches, where companies like Nutanix, Scale Computing, and Simplivity deliver a clustered file system that’s tightly integrated, via virtual machine, with their products.
The OpenStack Summit this week continued to fan the flames of the software-defined data center. The software-defined data center is just a term for replacing traditional data center hardware functionality with the same features implemented in software, running on commodity x86 servers. While software-defined approaches to data center features are at least nominally less expensive than their hardware counterparts, the real promise of the approach is flexibility and ease of management, with high levels of integration. Reconfiguring a network to support the security requirements of a new application is now just a function of software and APIs. Expanding storage is simply a matter of adding another node with more storage attached, and the cluster compensates automatically. Even things like firewall rules and load balancer configurations can now be stored as templates along with the applications, to be provisioned in minutes.
One of the differentiating features of an IaaS cloud implementation is that you do not get access to a consolidated, scalable storage infrastructure, at least not in the same way that you might expect if you were just scaling out compute nodes attached to the same SAN. You get remote block storage (Elastic Block Store, EBS, in the case of Amazon) connected to a specific machine image, and you get REST-style object storage (Simple Storage Service, S3, in the case of Amazon), which is shared amongst images but does not speak the traditional filesystem APIs.
A lot of people have become dependent on EBS, as it seems closest to what they are used to. Amazon suffered an outage when EBS failed simultaneously in two Availability Zones. If you were dependent on one of these zones (or mirrored across the two), you lost access to the filesystem from your instances. It is also worth noting that EBS volumes are not like CIFS or NFS filesystems, in that each can only be attached to a single instance, so you are still left with a bunch of headaches if you have a replicated mid-tier that expects to see a shared filesystem (for example, to retrieve unstructured data). It may be sensible to move to the S3 mechanism (or some portable abstraction over it) for new applications. If, however, you have an existing application that expects to see a filesystem in the traditional way, moving to S3 will require you to rewrite your code. You are then left looking for a distributed, cloud-agnostic shared filesystem with multi-way replication (including asynchronous replication), and this is where Gluster fits in.
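To make the "portable abstraction" idea concrete, here is a minimal Python sketch of what such a layer might look like. Everything in it is an illustrative assumption rather than part of any real product: the `BlobStore` interface, the two backends, and the `save_report`/`load_report` helpers are hypothetical names. The in-memory class only stands in for an object store such as S3; a real backend would call the provider's API. The filesystem backend simply writes under a root directory, which could be any mounted filesystem, including a shared one.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BlobStore(ABC):
    """Hypothetical minimal interface the application codes against."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class FilesystemStore(BlobStore):
    """Stores each key as a file under a root directory.

    Assumption: the root is an already-mounted filesystem (local or
    shared); mounting it is outside the scope of this sketch.
    """

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def _path(self, key: str) -> Path:
        path = (self.root / key).resolve()
        # Refuse keys that would escape the root directory.
        if self.root.resolve() not in path.parents:
            raise ValueError(f"invalid key: {key!r}")
        return path

    def put(self, key: str, data: bytes) -> None:
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return self._path(key).read_bytes()


class InMemoryObjectStore(BlobStore):
    """Stand-in for an object store; keeps objects in a dict."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]


def save_report(store: BlobStore, name: str, body: str) -> None:
    """Application code sees only the BlobStore interface."""
    store.put(f"reports/{name}.txt", body.encode("utf-8"))


def load_report(store: BlobStore, name: str) -> str:
    return store.get(f"reports/{name}.txt").decode("utf-8")
```

New code written against `BlobStore` can switch backends without changes, which is the appeal of the abstraction for new applications. An existing application that opens files directly gets no benefit from it without being rewritten, which is why a shared filesystem remains attractive in that case.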