How to Build the Ideal Storage Solution: Part 1 – Hardware
Last week at VMworld 2014 I met Andrew Warfield, the CTO and co-founder of Coho Data. We were both heading to the Expo center one morning to meet with others and to prepare the booth for that day’s activities. As we walked I asked Andy one of my favorite questions for any innovator:
“How did you come up with the idea for the Coho Data storage solution?”
I wanted to know what the moment of genius was that gave birth to a true web-scale storage solution. Was he watching a child stacking blocks when he suddenly had that flash of insight that storage should be modular? Did he do one too many storage controller upgrades and realize that there had to be a better way? Was he bitten by a radioactive salmon and suddenly given incredible storage super powers?
The answer was a humble and simple “I’ve just been working on these problems for a long time.”
While it’s not as cool to write about as radioactive-salmon-induced superpowers, as a sales engineer and former IT administrator this is the kind of answer that I want to hear. Knowing that a lot of thought and effort has been put into a product’s design means that you can have confidence in that product as a manufacturer, a salesperson, and, most important of all, as a customer. Andy and the Coho Data team did not just come up with a new and nifty widget that might have the potential to be useful. He and his fellow engineers set out with a mission to build the ideal storage solution for today’s virtualization and cloud-based workloads.
And they started with the hardware.
Traditionally, storage arrays have two bottlenecks. The first is the controller, which has to process all of the storage transactions in the form of reads and writes, and the second is the connection between the array and the applications that produce those storage transactions. In order to ensure that a storage solution would be adequate for future growth, you would design custom hardware that was supposedly more powerful and faster than was needed. You also built dedicated physical storage networks that, again, were supposedly much more powerful and faster than would ever be needed. This approach worked great if your data growth matched your hardware refresh cycle, and not too long ago that certainly was the case.
But then virtualization appeared in our datacenters, and suddenly the traditional approach to storage design was proven inadequate. We could easily use our compute infrastructure to its maximum potential now that multiple VMs could access and share the same CPUs and RAM, but storage was still being provisioned using a methodology based upon physical servers accessing a single isolated volume. Bottlenecks began to appear at both the controller and storage network layers. The virtualization revolution was being stalled not by expensive new technologies, but by expensive old technologies. The traditional storage array using custom hardware was just not designed for today’s IT infrastructure.
Here comes the commodity train!
Back to the Coho Data solution, and how hardware played an important role in solving the scaling problems most storage solutions fall victim to. Instead of relying on expensive single points of failure in the form of custom-built hardware controllers, Coho Data uses a commodity hardware based solution that has two fully redundant microarrays. CPUs and memory have improved massively in performance while dropping in price, to the point that custom hardware is no longer cost effective for the results it delivers. You can use off-the-shelf components to build a very budget-friendly solution that performs better while lowering the overall risk.
There is another huge benefit to a commodity based hardware solution: you can quickly assimilate new technologies into your designs in order to deliver a better product with a shorter time to market. Which brings us to the decision to use dual PCIe Flash storage cards in each Coho Data microarray. PCIe Flash is the best performing storage hardware commercially available today. PCIe Flash delivers more IOPS per dollar than traditional disks or even SSDs. By leveraging the performance of two PCIe Flash cards in each microarray we have now eliminated the controller performance bottleneck. And while PCIe Flash cards are not yet cost effective for storage capacity, each Coho Data microarray has six large capacity SATA drives so that you get the best of both worlds with blazing fast performance and high capacity. As the market produces faster and larger capacity hardware you will see newer Coho Data microarrays incorporate those components as well.
But what about the bottleneck between the applications and the storage controller? Here again the Coho Data team took an incredibly innovative approach using commodity hardware. Since each microarray has two 10GbE ports, why not use a 10GbE switch? For years our data centers have been built using Ethernet technologies for our networking needs, so it just makes sense to use that same technology to provide our storage networking needs as well. Not only do we reap the benefits of 10GbE speeds for all of our storage needs today, but the same commodity hardware based approach protects us moving forward when 40GbE and 100GbE solutions begin to emerge. Coho Data will simply produce new microarrays that take advantage of these new speeds as they become available.
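To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only multiplies out the port counts mentioned above; the function name and defaults are my own illustration, and the result is a theoretical line-rate ceiling, not a measured Coho Data throughput figure.

```python
def aggregate_line_rate_gbps(microarrays, ports_per_microarray=2, gbps_per_port=10):
    """Upper bound on combined Ethernet line rate, in Gb/s.

    Assumes every port can be driven at full line rate at once, which real
    workloads and switches will not always allow. Illustrative only.
    """
    if microarrays < 0:
        raise ValueError("microarrays must be non-negative")
    return microarrays * ports_per_microarray * gbps_per_port


if __name__ == "__main__":
    # Each added microarray brings its own two ports along with its storage.
    for count in (2, 4, 8):
        print(count, "microarrays:", aggregate_line_rate_gbps(count), "Gb/s line rate")
```

The point of the arithmetic is simply that network capacity grows alongside the storage as microarrays are added, and swapping `gbps_per_port` to 40 or 100 shows what a faster generation of ports would offer on paper.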
So how does this help me?
This is the kind of innovative solution you can expect when experts in a field have been working on the problem for a long time. The traditional monolithic array approach, with all of its limitations, is abandoned. Instead, a fresh approach that treats storage hardware as modular is embraced, and the result is a manufacturing process based upon quickly adopting the latest technology instead of locking customers into a storage purchase for the next three to five years.
Of course, for this approach to work it takes more than just hardware. A robust software architecture is needed so that all of the Coho Data microarrays work together to provide scalable, high-performing, high-capacity storage to the entire datacenter. To learn more about how the Datastream OS and Cascade Technology combine with the power and economics of affordable commodity hardware, be sure to check out my next blog post on “How to Build the Ideal Storage Solution: Part 2 – Its the software stupid!”