Data locality is a feature of some hyperconverged infrastructure (HCI) products. I’m going to spend a couple of articles looking at the implications of having, or not having, this feature. Data locality means that the host running a particular VM should have a complete copy of that VM’s data. Without data locality, the VM data is spread over every host in the cluster. Data locality is not a universal feature, and it can have significant impacts on the scaling of an HCI deployment. In particular, as clusters scale, the storage network can become a bottleneck as the remote IO increases. Working out just how much network traffic is involved took me a little mental gymnastics. The trick is to account for coincidental data locality. The result makes it clear that data locality matters a great deal when you scale your HCI out.
How Much IO Is Remote?
One of the core elements of HCI is that VM data is replicated across multiple hosts for redundancy. One result is that every VM disk write goes to at least two hosts. With data locality, only one write needs to cross the network; the other happens locally inside the host. Without it, both copies of the write might need to cross the network. Some solutions allow more than two copies of data, so even more data crosses the network. VM disk reads do not always cross the network. With data locality, the host running the VM has a full copy, so all reads are local. Without it, there is no guarantee that a read will be local; depending on the size of the cluster, there could be a lot of nonlocal data.
Coincidental Data Locality
This section only applies to systems without intentional data locality. If VM data is spread evenly over a whole cluster without considering which host is running the VM, then each host will have a fraction of each VM’s data. Specifically, that fraction is the number of data copies divided by the number of cluster nodes. For a three-node cluster with two data copies, each node will have 2/3 of each VM. For a 32-node cluster with three data copies, each node will have 3/32 of each VM. This coincidental data locality reduces the amount of IO that is nonlocal. As you can imagine, the larger the cluster, the less the coincidental locality.
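The fraction described above is simple enough to check in a few lines. This is a minimal sketch, assuming copies are spread evenly across nodes; the function name `coincidental_locality` is made up for illustration and does not come from any HCI product.

```python
from fractions import Fraction


def coincidental_locality(copies: int, nodes: int) -> Fraction:
    """Fraction of a VM's data held by any one node.

    Assumes the data copies are spread evenly over the cluster with no
    regard for which host runs the VM (hypothetical helper).
    """
    if not 1 <= copies <= nodes:
        raise ValueError("need 1 <= copies <= nodes")
    return Fraction(copies, nodes)


# The two examples from the text.
print(coincidental_locality(2, 3))   # 2/3
print(coincidental_locality(3, 32))  # 3/32
```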
Just Do the Math
Sooner or later, this was always going to end up with math. Storage performance always ends up with math. I’m going to calculate the proportion of VM IOs that need to go across the network for each node in a cluster. Here are the inputs:
- N = Number of nodes in the cluster
- R = Number of data copies to be written
- W = Fraction of VM IOs that are writes (0.2 for 20%)
- P = Network IOs per VM IO, as a fraction (multiply by 100 for a percentage; it can exceed 100%)
With Data Locality
For each VM write, one copy goes local and the other R – 1 copies cross the network. All reads are local, so they add nothing.
P = (W * (R – 1)) + ((1 – W) * 0)
Without Data Locality
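(R / N) is the fraction of coincidentally local IO. On average, R / N of the R copies of each write land on the local host, so R – (R / N) copies cross the network. A read is local with probability R / N, so 1 – (R / N) of reads are remote.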
P = (W * (R – (R / N))) + ((1 – W) * (1 – (R / N)))
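Both formulas translate directly into code. The sketch below is illustrative only: the function names are invented here, W is passed as a fraction, and the inputs (20% writes, two copies, three nodes) are just sample values.

```python
def remote_io_with_locality(w: float, r: int) -> float:
    """P with data locality: one write copy is local, all reads are local."""
    return w * (r - 1) + (1 - w) * 0


def remote_io_without_locality(w: float, r: int, n: int) -> float:
    """P without data locality: only the coincidental R / N share is local."""
    local = r / n  # fraction of each VM's data on the local host
    return w * (r - local) + (1 - w) * (1 - local)


# Sample inputs: 20% writes, 2 data copies, 3 nodes.
print(f"{remote_io_with_locality(0.2, 2):.1%}")        # 20.0%
print(f"{remote_io_without_locality(0.2, 2, 3):.1%}")  # 53.3%
```

Even in a three-node cluster, where each host coincidentally holds two thirds of every VM, these sample inputs give more than twice the network IO of the data locality case.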
Two Data Copies
So, the math is a little hard. Let’s look at some results. First off, we will compare clusters with two data copies and 80% read IO. 80% read is a typical server virtualization number. Having two data copies is typical of smaller clusters and some HCI products that enforce data locality.
Remote IOs with 20% write IO and 2 data copies:
|Node Count|With Data Locality|Without Locality|
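A comparison like this can be produced from the two formulas with a short script. This is a rough sketch: the node counts below are assumed purely for illustration and are not taken from the original table, and `remote_io_table` is a hypothetical helper name.

```python
def remote_io_table(node_counts, w=0.2, r=2):
    """Return (nodes, P with locality, P without locality) for each node count.

    w is the write fraction and r the number of data copies; the defaults
    match the 20% write, two-copy scenario discussed in the text.
    """
    rows = []
    for n in node_counts:
        with_locality = w * (r - 1)
        local = r / n  # coincidental locality
        without_locality = w * (r - local) + (1 - w) * (1 - local)
        rows.append((n, with_locality, without_locality))
    return rows


# Node counts chosen for illustration only.
for n, with_loc, without_loc in remote_io_table([3, 4, 8, 16, 32, 64]):
    print(f"{n:>5} nodes  {with_loc:>7.1%}  {without_loc:>7.1%}")
```

The with-locality column stays constant because N does not appear in that formula, while the without-locality column rises as the R / N term shrinks.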
Notice that with data locality, there is no change in the proportion of nonlocal IO as the cluster grows. Without it, the proportion climbs with node count, and a large cluster can end up doing more IO across the network than the VM actually issues; with these inputs, the formula above crosses 100% once the cluster passes 10 nodes. This is because most writes are sent across the network to two different locations, and most reads are remote as well. So, at large scale, data locality is critical to good HCI performance: it helps prevent storage networks from becoming bottlenecks in large HCI clusters. For HCI without data locality, it is important to consider the effects of remote IO as the cluster grows. In my next article, I will look at environments where two data copies are not enough. I’ll also discuss the effects of vMotion/Live Migration on data locality and the effect of IO size on the storage network.