Splunk Launches Beta of Hunk: Splunk Analytics for Hadoop

Right on the heels of VMware’s announcement of Log Insight, Splunk has announced the beta of Hunk: Splunk Analytics for Hadoop. This is a significant development for both Splunk and the big data analytics industry, because it lets customers use Splunk’s indexing, searching, and visualization features on top of Hadoop data stores.

Hunk: Splunk Analytics for Hadoop Overview

Combining Splunk visualization, querying, and indexing with Hadoop provides some ground-breaking advances in big data analytics. The most significant concerns what can be searched. In the existing Splunk product, searches can only run against data that a Splunk Indexer has indexed and placed in the Splunk data store. With Hunk, you can use the Splunk interface and query language to analyze all of the data in your Hadoop data store, irrespective of how or when that data got there. This is a critical point. If you already have a large amount of data in a Hadoop cluster and you then install Hunk, you do not have to index that data first; Hunk can immediately search and query against it.

You can read the Hunk: Splunk Analytics for Hadoop press release here.

A short video on Hunk is here.

The Hunk Product Page is here.

The Hunk Product Data Sheet is here.

A great blog on the new Splunk Virtual Indexes is here.

Hunk: Splunk Analytics for Hadoop and the Software Defined Data Center

In “The Big Data Back End for the SDDC”, we proposed a reference architecture for the management stack for the cloud and the software defined data center per the diagram below. One of the central tenets of this new architecture was that all of the data from all of the management products that manage a given function or layer in the SDDC or the cloud would be put in one central data store. This is essential for two reasons. First, the SDDC will generate so much activity that only a centralized big data store will be able to handle the resulting streams of management data. Second, data collected by one product will need to be accessible by another product. For example, if the APM product detects an increase in Application Response Time, it will need access to the security logs to see whether an attempted intrusion is causing resource contention, which would then be the root cause of the response time slowdown. This kind of cross-product correlation is impossible in the fragmented management stack that exists in the enterprise today, and will be essential to the smooth operation of the software defined data center.
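To make the APM and security log example concrete, here is a minimal Python sketch of cross-product correlation against a single shared store. It is an illustration only, not how Splunk, Hunk, or any VMware product works internally: the `Event` record, the `correlate` function, the host names, and the five-minute window are all hypothetical, and the in-memory list stands in for the central big data store.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """One record in the shared store (hypothetical schema)."""
    source: str          # which management product wrote it, e.g. "apm" or "security"
    host: str
    timestamp: datetime
    message: str


def correlate(store, trigger, other_source, window=timedelta(minutes=5)):
    """Return events from other_source on the trigger's host that
    occurred within `window` before the trigger event."""
    start = trigger.timestamp - window
    return [
        e for e in store
        if e.source == other_source
        and e.host == trigger.host
        and start <= e.timestamp <= trigger.timestamp
    ]


# Hypothetical sample data: every product writes into the same store.
t0 = datetime(2013, 7, 1, 12, 0, 0)
central_store = [
    Event("security", "web01", t0 - timedelta(minutes=2), "repeated failed logins"),
    Event("security", "web02", t0 - timedelta(minutes=1), "port scan detected"),
    Event("security", "web01", t0 - timedelta(minutes=30), "routine audit"),
    Event("apm", "web01", t0, "response time above threshold"),
]

# The APM product sees a slowdown and asks the store for nearby security events.
spike = central_store[-1]
suspects = correlate(central_store, spike, "security")
for e in suspects:
    print(e.timestamp.isoformat(), e.host, e.message)
```

Because both products write to the same store, the lookup is a simple filter on host and time. With separate per-product data stores, the same question would require exporting and joining data across tools.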


Analysis and Implications

With the announcement of this beta, and presumably the delivery of the actual product sometime later, Splunk is positioning itself as a leader in the new management software business. This has the following implications for Splunk and the rest of the management software industry:

  • With VMware having announced Log Insight, and Splunk now having extended its value to data stored in Hadoop, both companies have demonstrated a commitment to the idea that big data, and the searching, querying, and visualization of that big data, are foundational capabilities for a modern management software vendor. In other words, if you want to be the anchor tenant in the mall of the modern management stack, you had better have these capabilities.
  • Both companies also intend to own the Operations Management layer as it pertains to virtualization and the Software Defined Data Center. VMware has vCenter Operations, and is already demonstrating bi-directional integration between vCenter Operations and Log Insight. Splunk has the Splunk App for VMware which uses the Splunk back end as its data store. So VMware views Log Management as a feature of Operations Management, and Splunk views Operations Management as a feature of its Operational Intelligence strategy.
  • If you look at the reference architecture above, you see a lot of pieces where neither Splunk nor VMware has a best-of-breed solution. Therefore, for both Splunk and VMware, partnerships with adjacent vendors who are willing to use their respective data stores are essential to the long-term success of their strategies. This is an area where Splunk has a huge advantage over VMware, since Splunk has been partnering with adjacent third party vendors for quite some time, and VMware has not yet started on this front.
  • In the intermediate term this will boil down to a war between VMware and Splunk, and most importantly their respective ecosystems. This will be a very good thing for customers as there will be concerted competition between VMware and Splunk on multiple fronts, and a great deal of choice with respect to third party solutions that plug into these new management platforms.
  • Since VMware and Splunk are positioning themselves as leaders in the new management software business, we have to ask: at whose expense? The answer is the incumbent legacy management solutions that cannot be adapted to virtualization, the software defined data center, and the cloud. That means the legacy management software businesses of IBM, BMC, CA and HP. Of these four vendors, only HP is showing signs that it is rebuilding its stack for these new requirements with an entirely new set of internally developed products.
  • For customers, these developments should be the catalyst to start evaluating and buying management software for their virtualized environments very differently than in the past. The enterprise ELA is the wrong way to go. Purchasing a bunch of tactical point solutions over the phone is the wrong way to go. Every management solution is a piece in a larger puzzle. Enterprises need to decide how they are going to build their new management stacks, and then decide how the related pieces fit together.


VMware with Log Insight and Splunk with Hunk: Splunk Analytics for Hadoop have positioned themselves as leaders for the management platform that will be the foundation of the stack of products used to manage the software defined data center and the cloud. This will kick off a death spiral for the legacy management software businesses, and ultimately create a new ecosystem of publicly traded management software vendors.