In part one of this article, I lamented the state of our enterprise storage arrays and talked about the features we absolutely need on any new arrays bought this year. Why the lament? Because this is 2015, and we’re tired of the 1995 technology we’ve been using. When you send out your RFPs this year, the following are things you should score vendors on.
Part 1 of this multipart article covered Quality of Service (QoS) at the array level, as well as using flash, and not just cache, as primary data storage. Additional items to insist on include:
Tunable Autotiering and Granular Autotiering Reporting
Organizations often use different arrays to achieve different price points for their data storage. Active data goes to SAN-based arrays. Colder or less important data gets written to a NAS. Eventually, data gets spooled out to tape or near-line storage using some form of hierarchical storage management.
What a pain.
An organization doing this has to manage a whole bunch of different arrays, be trained on them, and have multiple ways of attaching to them. It has to pay for hierarchical storage management and data lifecycle management software, which aren’t cheap. And it has to copy data all over the place just to save some money. Organizations that do this sort of thing are generally some of the most risk-averse institutions around, but all these schemes are inherently risky. How do they know the data got copied correctly? How much human error could accrue in the course of managing different arrays?
Autotiering helps with this. It allows you to throw a ton of different types of storage on a single array, and the array figures out whether data should go on the cheap and slow bulk storage or on the expensive and fast flash. You have one array, one vendor, one type of training and documentation, and one type of connection mechanism. Easy, simple, cheap.
The problem is that most autotiering doesn’t cope with sudden changes in usage profiles. Nor does it let you move data around dynamically, to compensate for usage profile changes or other system design changes. For example, if you’re using an autotiering array for a DR site, you may have some absolutely horrible performance at that site if you ever fail over to it, because all your data will be down in the slow bulk storage. How fast can you promote the data to the next tier? Organizations need more control over this process.
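To make "more control" concrete, here is a minimal Python sketch of what a tunable policy with a manual override might look like. It is not any vendor's implementation: the tier names, the `promote_at` and `demote_at` thresholds, and the `pin_volume` helper are all hypothetical, and a real array tracks far more than a hit counter.

```python
from dataclasses import dataclass

# Fastest to slowest. These tier names are illustrative assumptions.
TIERS = ["flash", "sas", "nearline"]


@dataclass
class Extent:
    volume: str
    tier: str = "nearline"
    hits: int = 0          # accesses seen in the current sampling window
    pinned: bool = False   # True once an admin has overridden the policy


@dataclass
class TieringPolicy:
    promote_at: int = 100  # hits per window needed to move up one tier
    demote_at: int = 10    # hits per window at or below which data moves down

    def rebalance(self, extents):
        """Move each unpinned extent at most one tier per sampling window."""
        for e in extents:
            if e.pinned:
                continue
            i = TIERS.index(e.tier)
            if e.hits >= self.promote_at and i > 0:
                e.tier = TIERS[i - 1]
            elif e.hits <= self.demote_at and i < len(TIERS) - 1:
                e.tier = TIERS[i + 1]
            e.hits = 0  # start a fresh sampling window


def pin_volume(extents, volume, tier):
    """Admin override: place a whole volume on a tier right away."""
    moved = 0
    for e in extents:
        if e.volume == volume:
            e.tier, e.pinned = tier, True
            moved += 1
    return moved
```

In the DR scenario, the automatic path moves data only one tier per window, so a volume that has sat idle on bulk storage needs several busy windows to reach flash. Something like `pin_volume(extents, "erp-db", "flash")` at failover time is the kind of knob worth asking a vendor about.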
Organizations also need very granular reporting on autotiering, especially to diagnose performance problems. Many arrays autotier at the block level, so a large data object, like a virtual machine’s disk file, will end up on multiple tiers of storage. That can make for some wildly variable performance. Storage reporting and management tools are just not up to the task of helping IT staff diagnose performance issues like that.
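As a rough illustration of the per-object view that would help here, the Python sketch below summarizes where one object's blocks live. The input format (a flat list of tier names, one per block) and the 90 percent threshold are assumptions made for the example, not something any particular array exports.

```python
from collections import Counter


def tier_report(block_tiers):
    """Return {tier: percent of the object's blocks on that tier}."""
    counts = Counter(block_tiers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tier: round(100 * n / total, 1) for tier, n in counts.items()}


def is_split(block_tiers, threshold=90.0):
    """Flag an object when no single tier holds at least `threshold` percent."""
    report = tier_report(block_tiers)
    return bool(report) and max(report.values()) < threshold


# Hypothetical layout of one virtual machine's disk file: 6 blocks on flash, 4 on bulk.
vm_disk = ["flash"] * 6 + ["nearline"] * 4
print(tier_report(vm_disk))
print(is_split(vm_disk))
```

A report like this, per virtual disk rather than per array, is what lets you tie a slow VM to the fact that 40 percent of its blocks are sitting on bulk storage.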
Some vendors’ architectures mitigate these issues, adding large read caches on their controllers to help offset performance problems. Definitely scrutinize these approaches, and as with other items on this list, don’t let a vendor tell you negative things about its competitors—it’s usually incorrect information.
Simple and Effortless Management
I’ve often asked vendors why they hate me. “Hate you?” they reply. “What do you mean?” I then point out that they force me to use horrible, error-causing user interfaces to manage their products. These interfaces are designed by engineers to meet nobody’s needs. They’re difficult to use, require obscure and insecure versions of Java on clients, require separate management servers and licensed software, and cause outages and data loss when inevitable mistakes are made. Similarly, array maintenance tasks are generally horrible. Code upgrades are a long, arduous process—the EMC XtremIO full outage upgrade is the worst-case example, while Nimble Storage and Dell Compellent are some of the best-case examples. Scheduling maintenance operations like deduplication, replication, and snapshots is difficult. Adding storage requires a vendor technician. It’s all horrible.
You, as an IT person, are the user of these products. Demand more. Look at products like Coho Data’s DataStream, which has all the management onboard, requires no extra licensed software or management servers, is browser-based so you aren’t fighting with Java, and is streamlined to make routine tasks like growing the array and viewing usage reports very accessible and easy to do.
Direct Attachment of Hosts
Individual servers are huge nowadays. The recent Dell PowerEdge R730s can have thirty-six cores, seventy-two threads, and 768 GB of RAM, and that’s just two CPU sockets. For a remote office or for a small or medium enterprise, a pair of these servers can run everything. The problem is that almost no storage vendors support directly attaching hosts to their arrays. Nearly all of them insist that we purchase and operate a SAN, be it Ethernet-based or Fibre Channel. Many say direct attachment can be done, but then hedge and say it’s unsupported. Good luck calling support if you have a problem, then.
A SAN involves a complexity and cost I don’t need and don’t want, especially in a remote office. Imagine how simple life would be if I just could run cable from my host to my array. I wouldn’t have to worry about monitoring and supporting extra switches. I wouldn’t have to worry about someone making a change to a switch and killing my storage. I wouldn’t have extra cables to get bent, pinched, or damaged. I’d just have four $10 Ethernet cables from my two hosts to the array.
The only vendor I know of that does something like this is Coho Data. Because its DataStream uses 10 Gbps switches internally, you can cable directly to it. That’s wonderful, because this is one of those overlooked details that make a huge difference in the total OPEX of a system design. Of course, the problem with DataStream is that its minimum entry cost is significantly higher than that of a Nimble Storage array or Dell Compellent SC4020. If we all start asking vendors for these things, then perhaps by the next round of RFPs we’ll start seeing them.
As always when putting together a list, I had to be selective. What have I missed that you find absolutely crucial? Let us know in the comments!