Chad Sakac mentions on his blog that VNXe “uses a completely homegrown EMC innovation (C4LX and CSX) to virtualize, encapsulate whole kernels and other multiple high performance storage services into a tight, integrated package.” This got me thinking about other uses of VNXe. If EMC could manage to “refactor” or encapsulate a few more technologies, I think we have the makings of a killer virtualization security appliance. Why would a storage appliance spur thinking about virtualization security?
VNXe brings to the table several all-important items for solving one of the more interesting virtualization security problems facing people today: large-scale multi-tenant forensics. Yes, I know that Terremark is approaching a solution to the problem using off-the-shelf components such as network-based EnCase, NetWitness, and the usual lineup of well-understood digital forensics tools. However, I believe these tools fail to grasp the very nature of cloud and virtual environments and the proliferation of forensic data within them.
The core of digital forensic analysis is determining an accurate timestamp for the data within the digital device. Since this timestamp can be manipulated fairly easily with common tools, an external timestamp is often required (with a mapping back to the timestamp within the machine). What if a device could provide this timestamp? Such a device could also provide the same set of data at different points in time, so that you could determine what changed. Now that would be very cool from a forensics perspective. The solution has several pieces. The first is to implement VPLEX as a virtual appliance on a VNXe, giving me a small device with quite a bit of power that a forensic discovery team could wheel into a data center. The VNXe would act as a go-between, ensuring chain of custody and providing hashes of the data to be duplicated.
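To make the "what changed" idea concrete, here is a minimal sketch that compares two point-in-time copies of the same data block by block. It is purely illustrative: the 4096-byte block size, the use of SHA-256, and the function names are my assumptions, not anything VNXe or VPLEX exposes.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return the indexes of blocks that differ between two copies.

    A block that exists in only one of the two copies counts as changed.
    """
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    changed = []
    for index in range(max(len(old_hashes), len(new_hashes))):
        before = old_hashes[index] if index < len(old_hashes) else None
        after = new_hashes[index] if index < len(new_hashes) else None
        if before != after:
            changed.append(index)
    return changed
```

Given two copies taken at different times, `changed_blocks(earlier, later)` lists the block numbers an examiner would need to look at first.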
How would it do this? This assumes that VPLEX with SRDF underneath it makes a block-by-block duplicate of the data that is proven to be forensically sound, and that each block or set of blocks has an appropriate hash associated with it (hashing is normally done at the file level). VNXe would be the cloud-side VPLEX component, which would then send data to a secured remote datacenter used purely for digital forensic analysis. Think of VNXe with VPLEX as a datastore to which you would Storage vMotion (SVMotion) the data, which would then be duplicated at the secure location. The second piece of the puzzle would be the ability to duplicate the in-use memory of the tenant's virtual machines at will and write that copy to the local VNXe drives, either when a vMotion is performed or at an administrator's whim. The third piece of the puzzle would be to grab vLockStep data to ensure you can replay the CPU processing as of a given point in time. The fourth piece of the puzzle would be the ability to grab historical data showing where within the storage array tenant data has ever lived.
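As a rough sketch of what per-block hashes with an external timestamp and chain of custody could look like, the following builds a hash-chained manifest for an acquired image. Everything here is an assumption for illustration: the record fields, the 4096-byte block size, and the `clock` callable standing in for the acquisition device's trusted clock are hypothetical and do not describe how VPLEX or SRDF actually work.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # assumed starting value for the hash chain
FIELDS = ("index", "sha256", "acquired_at", "prev")


def _record_hash(record: dict) -> str:
    """Hash the chained fields of a record in a stable key order."""
    body = {key: record[key] for key in FIELDS}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def build_manifest(image: bytes, block_size: int = 4096, clock=None) -> list:
    """Create one record per block, each linked to the previous record."""
    # clock is a hypothetical stand-in for the external (device) timestamp.
    clock = clock or (lambda: datetime.now(timezone.utc).isoformat())
    manifest, prev = [], GENESIS
    for index, offset in enumerate(range(0, len(image), block_size)):
        block = image[offset:offset + block_size]
        record = {
            "index": index,
            "sha256": hashlib.sha256(block).hexdigest(),
            "acquired_at": clock(),
            "prev": prev,
        }
        record["record_hash"] = prev = _record_hash(record)
        manifest.append(record)
    return manifest


def verify_manifest(image: bytes, manifest: list, block_size: int = 4096) -> bool:
    """Check that the image matches the manifest and the chain is unbroken."""
    expected_blocks = -(-len(image) // block_size)  # ceiling division
    if len(manifest) != expected_blocks:
        return False
    prev = GENESIS
    for index, record in enumerate(manifest):
        offset = index * block_size
        block = image[offset:offset + block_size]
        if record["index"] != index or record["prev"] != prev:
            return False
        if hashlib.sha256(block).hexdigest() != record["sha256"]:
            return False
        prev = _record_hash(record)
        if prev != record["record_hash"]:
            return False
    return True
```

Because each record includes the hash of the one before it, altering a block, a timestamp, or the order of records after acquisition makes `verify_manifest` return `False`. A real chain of custody would also need the manifest itself signed and stored somewhere the suspect cannot reach.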
The really cool aspect of this approach is that, by the nature of SVMotion, vMotion, and vLockStep, the GuestOS does not KNOW any of this is happening, and therefore a suspect would not know either. No need to power off systems, put them to sleep, or use agents within the VM. Everything happens externally. However, for this mechanism to work, the tools and processes need to be proven and improved. Questions still on my mind:
- Can we prove VPLEX provides a block-by-block duplication of storage data?
- Can we prove that SVMotion, vMotion, and vLockStep are undetectable from within the VM?
- Can vLockStep be modified to log all lockstep data instead of running a secondary VM, or does it make a difference? If we are just logging, could timing data be inserted so that VMs with multiple vCPUs can be handled?
- Can SVMotion and vMotion be used to capture data but never complete the full transfer, much like the way FT setup works? Or is this another aspect of FT that can come into play to perform forensic acquisition?
- How fast can we transfer the data?
- Can clouds be designed to ensure Tenant data sits on different datastores within the virtualization environment?
- Can we track where each Tenant’s data has resided at any time during its lifespan within the cloud’s storage subsystem?
The last question (#7) may have an answer in the Akorri BalancePoint tool, as it can track this information down to the spindle, yet it requires you to have been running BalancePoint from the beginning. Will it be possible to ‘refactor’ 3rd party tools into the VNXe so that we can get better forensics?
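The kind of history that last question asks for can be sketched as an append-only placement log. This is a toy model under my own assumptions: the `Placement` and `PlacementLog` names, the datastore labels, and the integer timestamps are all hypothetical, and it says nothing about how BalancePoint or any array actually records this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Placement:
    tenant: str
    datastore: str
    start: int            # time the data arrived (e.g. epoch seconds)
    end: Optional[int]    # time the data left, or None if still there


class PlacementLog:
    """Records every datastore a tenant's data has lived on, and when."""

    def __init__(self):
        self._records = []

    def move(self, tenant: str, datastore: str, at: int) -> None:
        """Close the tenant's open placement (if any) and open a new one."""
        for i, rec in enumerate(self._records):
            if rec.tenant == tenant and rec.end is None:
                self._records[i] = Placement(rec.tenant, rec.datastore, rec.start, at)
        self._records.append(Placement(tenant, datastore, at, None))

    def history(self, tenant: str) -> list:
        """Every datastore the tenant has lived on, oldest first."""
        return [rec.datastore for rec in self._records if rec.tenant == tenant]

    def located_at(self, tenant: str, at: int) -> Optional[str]:
        """Where the tenant's data lived at a given moment, if anywhere."""
        for rec in self._records:
            if rec.tenant == tenant and rec.start <= at and (rec.end is None or at < rec.end):
                return rec.datastore
        return None
```

For example, after `log.move("tenant-a", "ds1", 100)` and `log.move("tenant-a", "ds2", 200)`, `log.located_at("tenant-a", 150)` returns `"ds1"`. As with BalancePoint, a log like this only helps if it has been kept from the beginning.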
As you can see, VNXe has spurred thoughts about creating a new family of tools to solve a pesky problem, but can it be used effectively? Can the method be proven to be forensically sound, and will cloud providers prepare for such a process? I believe so!