In June, I was in Boston for Virtualization Field Day 5, which was an amazing event. The sponsor presentations are usually awesome. The next best thing about Tech Field Day events is the conversations that you have with other delegates between the presentations. On one trip, Stephen Foskett wondered why none of the hyperconverged vendors has converged networking. All of the hyperconverged vendors use physical Ethernet switches. I spent the next half hour talking with Chris Marget about what the requirements might be and what networking technology might be used.
As we know, hyperconvergence places storage and compute into each node, then builds both storage and compute clusters by joining nodes. The VMs that these solutions are designed to deliver consume the four food groups of resources: CPU, RAM, disk, and network. Current hyperconverged solutions converge the first three and rely on external devices to deliver the fourth. While it’s not mandatory, the storage network is usually a pair of 10 GbE switches.
The questions that Chris and I explored are: “How would you converge the network into the hyperconverged nodes? In particular, could you build a network between the nodes that would carry all of the storage traffic between them?” This is the largest network load in a hyperconverged cluster: replicating written data and fetching non-local reads. Using an existing network for VM and management traffic would be fine, but the hyperconverged cluster traffic should be kept separate. I would consider the cluster network converged if the cables simply ran from node to node—no external switches or concentrators. The most obvious way to do this is with a mesh network, where each node connects directly to every other node. The problem is that a full mesh needs a cable from each node to every other node. That’s not a big problem with four nodes (six cables), but it’s a real issue with a 64-node cluster (2,016 cables).
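The cable counts above follow from the standard pairwise formula: a full mesh of n nodes needs n(n−1)/2 links. A quick sketch (the function name is just for illustration):

```python
def mesh_cables(nodes: int) -> int:
    """Cables for a full mesh: one link per unique pair of nodes, n*(n-1)/2."""
    return nodes * (nodes - 1) // 2

print(mesh_cables(4))   # 6
print(mesh_cables(64))  # 2016
```

The quadratic growth is the whole problem: doubling the cluster size roughly quadruples the cabling.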
The other alternative is a loop cabling architecture. Each node connects to the next, and the last node connects back to the first to form a loop. I would go with a dual loop design for redundancy and so that expanding one loop doesn’t cause the whole network to stop. I’m also thinking more of FDDI loops than of token ring loops—node to node, without central switches.
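By contrast, loop cabling scales linearly: each ring uses one cable per node, so a dual loop needs 2n cables. A small comparison sketch (again, the helper name is hypothetical):

```python
def dual_loop_cables(nodes: int) -> int:
    """Two independent rings: each ring closes back on itself, one cable per node per ring."""
    return 2 * nodes

# At 64 nodes: 128 cables for a dual loop, versus 2,016 for a full mesh.
print(dual_loop_cables(64))  # 128
```

This is why ring topologies like FDDI stayed practical at scale where full meshes did not.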
This is where it got very interesting: Chris told me about Plexxi. Plexxi has interesting networking technology that uses passive optical interconnects, allowing every node to be one hop from every other node without a sea of cables. Given that the interconnects are passive, I would guess that node-to-node cabling is probably also possible. What if this technology were integrated into the hyperconverged nodes in place of 10 GbE? Your hyperconverged cluster wouldn’t need 10 GbE switches, just optical cables looping from node to node. It does look like you could attach a router to the Plexxi network for VM and management traffic, removing the need for 1 GbE to each node. Even more interesting is Plexxi’s technology for delivering the same layer 2 networks over a 100 km fibre to a remote data centre. This would be an immediate enabler for a metro cluster environment: a hyperconverged platform with an inherent ability to provide active/active clustering across data centres, with live VM migration and HA between them.
Does it make sense to further converge the network within a hyperconverged solution? I suspect not. While 10 GbE switch ports are not cheap, they aren’t a huge part of your hyperconverged infrastructure cost. The network is still converged compared to using Fibre Channel for storage and Ethernet for VMs and management. For customers who want to minimize cabling, a hyperconverged node can work with three network cables (two 10 GbE plus 100 Mb Ethernet for out-of-band management) and two power cables. Reducing that to two optical cables plus power isn’t a huge saving. Interesting discussions are educational, but not every one leads to a breakthrough.