On January 12, 2011, NetApp announced that it would acquire Akorri. The significance of this move can only be understood with some background on Akorri.
Akorri was founded in 2005 by Rich Corley, a serial entrepreneur with a very deep background in storage and, in particular, in how storage performance affects the workloads that rely upon storage systems. At the time of the founding, virtualization was not the overwhelming phenomenon that it is today, and Akorri focused upon understanding how the intersection of storage performance and the performance of the rest of the infrastructure (servers and networks) affected the performance of the overall system.
Akorri had two crucial insights very early in this process. The first was that while there were many monitoring solutions that collected garden-variety data from public interfaces and tried to infer something about performance from it, real insights about performance would only be possible with data that was not available to every monitoring vendor. This was especially true in the storage realm, where many storage vendors (especially the leader in the space, EMC) were known for purposely doing a horrible job of making any relevant performance information available through a standard interface like SMI-S.
Akorri therefore instrumented each array individually, and in many cases instrumented each version of the software for each array individually. This took a great deal of time and cost quite a bit of money, but once Akorri had support for a critical mass of storage arrays, it was in a unique position to deliver extraordinary results because it started the process with extraordinary data.
The second insight was even more crucial. Akorri recognized that while there was a relationship between resource contention in the infrastructure supporting a workload (application) and that workload's performance, it was not a straightforward relationship. Sometimes an overloaded resource would matter, sometimes it would not, and more often than not it was the combination of overloads along the infrastructure chain that caused the problem. This led Akorri to conclude that surfacing a resource utilization metric as the indicator of performance was wrong, and that what was needed was a metric that measured the overall responsiveness of the system. That metric was Infrastructure Response Time, and it was derived via a sophisticated queuing model developed by a team of statisticians who were among the first people Rich Corley brought into the company.
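Akorri's actual model was proprietary and far more sophisticated, but even the textbook M/M/1 queuing formula illustrates why a utilization number alone is a poor proxy for performance: response time grows nonlinearly with utilization, so a resource at 80% and one at 95% are in very different situations. The sketch below (service time and utilization values are illustrative only) shows this effect.

```python
# Illustrative M/M/1 queuing sketch: response time vs. utilization.
# This is NOT Akorri's model (which was proprietary); it only shows
# why raw utilization is a misleading indicator of responsiveness.

def response_time(service_time: float, utilization: float) -> float:
    """Mean response time of an M/M/1 queue: W = S / (1 - rho)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1 - utilization)

# A storage device with a 5 ms service time:
for rho in (0.50, 0.80, 0.90, 0.95):
    w_ms = response_time(0.005, rho) * 1000
    print(f"utilization {rho:.0%}: mean response time {w_ms:.1f} ms")
```

Going from 50% to 95% busy multiplies the response time tenfold (10 ms to 100 ms), even though the utilization metric "only" doubled, which is exactly why a responsiveness metric carries information that a utilization metric does not.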
It took virtualization for the wisdom of Akorri's approach to become fully apparent. As virtualization took hold, it became abundantly clear that once static, dedicated systems became dynamic and virtualized, one could no longer infer the performance of a workload from resource utilization metrics. A new category of monitoring tools was born: Infrastructure Performance Management, led by Akorri and characterized by the use of Infrastructure Response Time.
The significance of Akorri’s approach was not lost on other vendors. NetQos, a network performance management vendor in Austin, Texas, turned its physical appliance, which collected TCP/IP performance data from the mirror port on physical switches, into a virtual appliance that did the same on the VMware vSwitch. NetQos was acquired shortly thereafter (in the fall of 2009) by CA.
Virtual Instruments spun out of Finisar (the maker of the optical transceivers in SAN switches) and took an Infrastructure Response Time approach to SAN performance. VI created a metric, Exchange Completion Time, that measures the round-trip performance of every transaction on the Fibre Channel SAN. Virtual Instruments has seen tremendous success with this approach in the largest enterprises, which see the value of using the SAN as a point of visibility into their storage systems and their performance.
Xangati took the approach of deeply instrumenting the IP network, starting with NetFlow data and then adding more and more response time metrics, allowing Xangati to show VMware administrators where in their LAN and WAN the bottlenecks impacting overall system performance were.
Akorri, CA/NetQos, Virtual Instruments, and Xangati therefore emerged to comprise the Infrastructure Performance Management category, and were profiled in this post last year.
The Future of Infrastructure Performance Management
While the IPM category has existed (at least here at The Virtualization Practice) since mid-2009, and while the participants in the category have made great progress, much work remains to be done. This is especially true now that Akorri will be part of NetApp, since a bias toward supporting NetApp's storage products will creep into Akorri's product decisions. This is very good news for NetApp's customers, who will now have the market-leading IPM product focused on their storage environment. NetApp is also known for taking a vendor-neutral approach in its existing SANscreen storage management product, so there is good reason to hope that Akorri's products will not drop support for competing storage arrays, and that it will in fact remain possible to use Akorri's solutions to objectively compare the performance of multiple storage arrays at a single customer site.
The fact is that even though much progress has been made, the real problem remains unsolved. Akorri was able to calculate IRT from the perspective of the workload on the server (physical or virtual) that communicated with the storage system. Therefore Akorri’s view of the world starts with the server that is talking to the HBA. Virtual Instruments understands the storage subsystem from the point of view of the SAN. Xangati understands the IP network end-to-end.
So the real problem that remains unsolved is a real-time, continuous, and deterministic end-to-end understanding of Infrastructure Response Time. Specifically, this means answering the question: when a workload (any workload that is part of any application) places a request for work upon the infrastructure, how long does it take for that infrastructure to respond? The four vendors discussed in this article have all addressed pieces of this question, but no one has yet answered it in its totality.
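One way to picture the unsolved end-to-end problem is that each vendor measures the response time of one tier of the request path, while what the workload owner cares about is the total. The sketch below is purely hypothetical (no vendor discussed here exposes such an API, and the tier names and millisecond values are made up); it simply shows end-to-end IRT for a serial request path as the sum of per-tier contributions, which is what a complete solution would have to measure continuously and in real time.

```python
# Hypothetical composition of end-to-end Infrastructure Response Time
# from per-tier measurements. Tier names and values are illustrative
# only; they do not come from any real product or measurement.

from dataclasses import dataclass

@dataclass
class TierMeasurement:
    tier: str           # e.g. "hypervisor", "IP network", "SAN", "array"
    response_ms: float  # this tier's response time contribution in ms

def end_to_end_irt(tiers: list) -> float:
    """End-to-end IRT for a serial request path: sum of tier latencies."""
    return sum(t.response_ms for t in tiers)

path = [
    TierMeasurement("hypervisor", 0.4),
    TierMeasurement("IP network", 1.1),
    TierMeasurement("SAN", 0.3),
    TierMeasurement("storage array", 4.2),
]
print(f"end-to-end IRT: {end_to_end_irt(path):.1f} ms")  # prints 6.0 ms
```

The hard part, of course, is not the arithmetic but obtaining trustworthy, time-aligned measurements for every tier at once, which is exactly where each of the four vendors covers only a piece of the path.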
Addressing the question of end-to-end Infrastructure Response Time is one of the key challenges that will have to be met for virtualization to progress beyond its “low-hanging fruit” stage in many enterprises and address business-critical applications. The teams that own these applications are going to demand that the owners of these new virtual and dynamic infrastructures be able to measure and assure the performance of those infrastructures on behalf of these business-critical applications. The IT Operations teams running these virtual infrastructures will need IPM tools, if for no other reason than to be able to defend themselves in blamestorming (“something is wrong, let’s assign the blame”) meetings.
Infrastructure Performance Management and Public Clouds
If addressing this problem in enterprises is not enough of an opportunity, it is an even greater requirement and opportunity in public clouds. Security and performance assurance are the two top reasons why enterprises are reluctant to put business critical applications into public clouds. The performance concerns are not going to go away until cloud vendors can publish more than useless virtual resource consumption statistics to their customers. These requirements and where we stand meeting them were discussed in this post.
The acquisition of Akorri by NetApp demonstrates the importance of Infrastructure Performance Management solutions as virtualization progresses into the realm of business-critical applications, and as public clouds hope to do the same. However, rather than signalling “game over,” this acquisition raises both the visibility and the importance of the problems that Akorri solved and of the true end-to-end problems that remain.