In 1980 I was working for Datapoint, a vendor with proprietary client hardware, proprietary server hardware, a proprietary LAN, and proprietary systems software. In 1981 IBM introduced the PC, and in 1983 it introduced the PC XT with a hard disk. 3Com brought Ethernet to market, and Novell created a network operating system. All of a sudden, Datapoint was on the wrong side of history in the computer business. In five short years, Datapoint went from six thousand employees to sixty.
Is Shared Enterprise Storage on the Wrong Side of History?
To date, shared enterprise storage has been the beneficiary of two huge trends in the computer industry: data center virtualization, which needs shared storage for certain highly valuable features such as vMotion to work, and big data, which produces so much data that large-scale arrays are needed. Shared enterprise storage arrays also sport a variety of highly attractive enterprise-class features like redundancy (no single point of failure), deduplication (saving on disk space and cost), and the ability to move data around as business needs dictate. These arrays also have a huge amount of hardware engineering behind them, designed to deliver excellent performance at scale.
However, several recent trends have emerged that offer a competing vision for how storage should be managed in the enterprise. That competing vision is that storage should be combined with compute (again) into scale-out nodes. This has the advantage of leveraging cheap commodity hardware and of scaling horizontally simply by adding more nodes. This is how Google built its cloud. That works well for Google, but the approach has not yet made the transition into mainstream enterprise computing.
This is, however, all about to change, due to three somewhat related initiatives.
The Hadoop Scale-Out Architecture
Hadoop is based on the Hadoop Distributed File System (HDFS). The original designers of HDFS built it to work on cheap commodity hardware with cheap commodity local disks. While you certainly can run Hadoop on enterprise-class shared storage, most people don’t. An image depicting a Hadoop cluster is below. Notice the reliance on hard disks in the servers.
The VMware VSAN Scale-Out Architecture
VMware has just announced the general availability of VSAN. What VSAN does is pool the local hard disks in the servers into a shared array of storage that looks like network-attached storage (NAS) and supports redundancy across the cluster of nodes. This means that no bit of data is in only one place, so there is no single point of failure that can cause data to be lost. Again, notice the reliance on local flash and local direct-attached hard disks in the servers. You can read our review of the initial release of VSAN here.
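To make the "no bit of data is in only one place" idea concrete, here is a minimal sketch in Python. It is not VSAN's actual placement algorithm, which this article does not describe. The function names, the node names, and the choice of two copies per block are all assumptions made purely for illustration. The sketch places each block on two different servers and then checks that losing any single server still leaves a copy of every block.

```python
import hashlib


def place_replicas(block_id: str, nodes: list[str], copies: int = 2) -> list[str]:
    """Pick `copies` distinct nodes for a block (hypothetical placement rule)."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    # Hash the block id to choose a deterministic starting node, then take
    # consecutive nodes so the copies always land on different servers.
    digest = hashlib.sha256(block_id.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def survives_single_failure(placement: dict[str, list[str]], failed: str) -> bool:
    """Return True if every block keeps a copy on a node other than `failed`."""
    return all(
        any(node != failed for node in holders)
        for holders in placement.values()
    )


# Illustrative four-node cluster; the node names are made up.
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = {
    f"block-{i}": place_replicas(f"block-{i}", nodes) for i in range(100)
}

for node in nodes:
    print(node, "can fail safely:", survives_single_failure(placement, node))
```

The design point is simply that the copies of a block must sit on different servers. Two copies on two disks inside the same server would not protect against that server failing.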
The Nutanix Scale-Out Architecture
Whenever anyone decides to start a brand-new hardware company around a new computing architecture, they are either crazy or onto something big. Well, some of the founders of Nutanix are from Google and have more than a passing acquaintance with how the Google File System works. So, Nutanix combines scale-out nodes that contain compute, networking, and storage with their own file system and then a hypervisor. Once more, note the presence of flash and direct-attached hard disks in each compute node.
Strategic Implications of These Scale-Out Architectures
Each of these products has the potential to disrupt the status quo in the computer industry in fundamental ways:
- VMware VSAN has the short-term potential to create a new price-performance point in enterprise storage. VSAN has the long-term potential to completely reshape the data center compute and storage landscape, especially as storage vendors start to build disk drives suited to this new use case. If this sounds far-fetched, remember that Intel has modified its chips to make them better at the virtualization of compute. There is nothing that would stop the disk drive vendors from doing the same over time as VSAN and similar solutions take off.
- Nutanix has the short-term potential to accelerate data center virtualization by making the integration of disparate components less daunting for the customer. In this regard, think of Nutanix as the next generation of a Cisco UCS or a VCE Vblock. However, the long-term implications of Nutanix’s success are much more profound. If Nutanix succeeds in building a brand-new category of servers for the data center, then the table stakes for all server vendors will be the ownership of a credible distributed file system. Since credible distributed file systems do not grow on trees, this will put Cisco, IBM, HP, and Dell in a position of severe competitive disadvantage.
- Hadoop and HDFS have an equally profound potential for impact. In the short term, Hadoop is going to destroy the proprietary data warehousing business. But this is a largely read-heavy and parallel query use case. Right now, Hadoop and HDFS cannot be used to replace transactional RDBMSs like Oracle and Microsoft SQL Server. But someone is going to come along and create an open-source project that turns HDFS into a transactional file system and that creates API compatibility with Oracle, SQL Server, and MySQL. At that point, there will be no reason for customers to buy expensive database-server-specific hardware and expensive RDBMS software, as they will be able to replace it all with the transactional follow-on to Hadoop and HDFS. This will destroy the enterprise database server business of Oracle, and since Oracle only has hostages and not customers, this will be a very good thing.
If the above comes to pass, the enterprise storage business, the data center server business, and the RDBMS business will all be disrupted. Think about what VMware is doing to the networking business and combine it with the dynamics above. Vendors of proprietary hardware and software are going to come under severe pressure, just as my first employer in this industry did. Most of us would consider this to be a good thing.
Google, VMware, Hadoop, and Nutanix have all voted at least in part for a scale-out architecture that does not rely on enterprise storage. VMware and Nutanix are important because they apply these concepts to traditional enterprise workloads. Hadoop represents the future of how data-intensive computing will be done. Together, these four represent a serious threat to enterprise shared storage.