The Virtualization Practice

Tag Archive for Hadoop


At the recent Misti Big Data Security conference many forms of securing big data were discussed from encrypting the entire big data pool to just encrypting the critical bits of data within the pool. On several of the talks there was general discussion on securing Hadoop as well as access to the pool of data. These security measures include RBAC, encryption of data in motion between hadoop nodes as well as tokenization or encryption on ingest of data. What was missing was greater control of who can access specific data once that data was in the pool. How could role based access controls by datum be put into effect? Why would such advanced security be necessary?


As we look at privacy of big data within any cloud, on premise, or mixed, we need to realize that the amount of data could be so large that retroactively redacting data may be itself a big data problem and that redacting well defined PII is a possibility on ingest as well as using tools like DataGuise to redact, encrypt, tokenize, etc. such data retroactively can be accomplished as another big data task, but that only handles well known PII. How do we handle derived PII?

Agile Cloud Development

At EMCworld 2013, one of the big stories was Pivotal and it’s importance to the EMC2 family and the future of computing. Pivotal is geared to provide the next generation of computing. According to EMC2 have gone past the Client-Server style to a scale-out, scale-up, big data, fast data Internet of Things form of computing. The real question however, is how can we move traditional business critical applications to this new model, or should we? Is there migration path one can take?