In 1980 I was working for Datapoint, a vendor with proprietary client hardware, proprietary server hardware, a proprietary LAN, and proprietary systems software. In 1981 IBM introduced the PC, and in 1983 it introduced the PC-XT with a hard disk. 3Com introduced Ethernet adapters, and Novell created a network operating system. All of a sudden, Datapoint was on the wrong side of history in the computer business. In five short years, Datapoint went from six thousand employees to sixty.
Articles Tagged with Hadoop
Microsoft is in the news for finally ending its extended, months-long preview period for HDInsight and rolling out the welcome mat for big data workloads in the Microsoft Windows Azure cloud computing platform.
At the recent Misti Big Data Security conference, many approaches to securing big data were discussed, from encrypting the entire big data pool to encrypting only the critical bits of data within it. Several of the talks included general discussion of securing Hadoop itself as well as access to the pool of data. These security measures include RBAC, encryption of data in motion between Hadoop nodes, and tokenization or encryption of data on ingest. What was missing was finer-grained control over who can access specific data once it is in the pool. How could role-based access controls be applied per datum? Is such protection too expensive given the time-critical nature of analytics, or are there other ways to implement datum-level security?
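To make the per-datum question concrete, here is a minimal sketch, assuming a simple tagging scheme: each field in a record carries a sensitivity tag, and a role only sees the fields its clearance covers. The names (`ROLE_CLEARANCE`, `filter_record`, the tags themselves) are hypothetical illustrations, not part of any Hadoop API.

```python
# Hypothetical datum-level RBAC sketch: every field carries a
# sensitivity tag, and a role's clearance set decides which
# fields it may read. Not drawn from any real Hadoop component.

ROLE_CLEARANCE = {
    "analyst": {"public", "internal"},
    "admin": {"public", "internal", "restricted"},
}

def filter_record(record, role):
    """Return only the fields whose tag the role is cleared to see."""
    allowed = ROLE_CLEARANCE.get(role, set())
    return {field: value
            for field, (value, tag) in record.items()
            if tag in allowed}

record = {
    "user_id": ("u-1001", "public"),
    "region": ("EMEA", "internal"),
    "ssn": ("123-45-6789", "restricted"),
}

print(filter_record(record, "analyst"))  # the restricted field is withheld
```

The cost question raised above is visible even here: every read now pays a per-field policy check, which is exactly the overhead that time-critical analytics may not tolerate.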
We are seeing more and more cloud-based big data solutions for security, business analysis, application performance management, and many other things whose results we see every day, from our searches on Google, Bing, and the like to the email we get from various marketing campaigns. We know that governments and many others are using big data, whether in cloud or on-premises form, to correlate various forms of data to determine who we are, where we are going, what we are doing, how we are doing something, and sometimes why we are doing anything. So with all this data out there in the hands of ‘others’, how can privacy be achieved for the individual? We touched on this in the Internet of Things: Expectation of Privacy article, in which we spoke about the handling of personally identifiable information (PII).
At EMCworld 2013, one of the big stories was Pivotal and its importance to the EMC2 family and the future of computing. Pivotal is geared to provide the next generation of computing. According to EMC2, we have gone past the client-server style to a scale-out, scale-up, big data, fast data, Internet of Things form of computing. The real question, however, is how we can move traditional business-critical applications to this new model, or whether we should. Is there a migration path one can take?
We recently had a conversation with DataStax regarding their DataStax Enterprise product, which got us thinking a little about the nature of big data and cloud. DataStax is the company behind the open source Cassandra NoSQL database; it provides technical direction and the majority of committers to the Apache Cassandra project. Cassandra, in turn, is a column family-based database along the lines of Google’s BigTable. If you are a SQL programmer, its defining feature is… it doesn’t do joins.
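What "no joins" means in practice is that data is modeled around queries: what a relational schema would join at read time, a column-family store duplicates at write time. A minimal sketch of that denormalization idea, using plain Python dicts as stand-ins for tables (this is illustrative, not actual Cassandra or CQL code, and the table and field names are invented):

```python
# Illustrative denormalization sketch (plain dicts standing in for
# column families, not real Cassandra/CQL): with no joins available,
# the "join" happens at write time by copying the user's fields into
# each order row, so one lookup by user answers the whole query.

users = {"u1": {"name": "Ada", "email": "ada@example.com"}}

# Relational style would store only user_id here and join at read time.
orders_by_user = {}

def record_order(user_id, order_id, total):
    user = users[user_id]
    row = {
        "order_id": order_id,
        "total": total,
        # duplicated from `users` so no second lookup is needed
        "user_name": user["name"],
        "user_email": user["email"],
    }
    orders_by_user.setdefault(user_id, []).append(row)

record_order("u1", "o42", 99.5)

# A single read of one partition returns everything the query needs.
print(orders_by_user["u1"][0]["user_name"])
```

The trade-off is the classic one: reads become single-partition and fast, while writes carry redundant data and updates to a user must touch every copy.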