Software Defined Distraction

sdale

Over the past few months, analysts, reporters, and customers have called us (“accused us of being” might be a more accurate characterization) a “software defined storage” system.

I don’t know what this means.  Moreover, I’m not sure that any two of them have the same sense of what software defined storage might, or even should, be.

Software-defined networking is a captivating idea.  Unfortunately, the top-level message seems to be dumbed down to something along the lines of, “the datacenter is changing, so let’s do networking differently”.  As a result, we get a template along the lines of:  Software Defined Thing means “the datacenter is changing, so let’s do thing differently”.  Over the past couple of years, we’ve seen pretty much every noun that you might reasonably imagine punched into this template: from software defined networks to software defined storage to software defined datacenters.

Here’s the problem: This abuse of “software defined” has become so rampant that it is having a negative effect on the new domains where it’s being applied.  Technical colleagues tell me that they stop listening when someone uses the term, and many of the folks on our own engineering team have pleaded that we not use it to describe our products.  Because people don’t know what the term means, or what goals it might imply, it is completely ineffective as a basis for discussion.  It’s just distracting.

From a storage perspective, this is especially frustrating, because storage is changing in a really dramatic way.  I’m excited about these changes, and about how they fit into the broader datacenter landscape.  I also think that there are some interesting parallels, and some significant differences, between the changes that storage is undergoing and those in the areas where the term “software defined” has been applied in the past.

In the next few blog posts, I’d like to talk about some of these similarities and differences.  To start, let’s try to get an understanding of where this term came from, and what it originally meant.

Radio: The mother of software definition.

The first time I remember hearing the term “software defined” was in regard to software defined radio.  It was around 1999, I was a grad student, and SDR was a really interesting topic in systems research.  The starting point at the time, and it remains relevant today, was that significant parts of wireless network devices, whether they were 802.11 devices, mobile phones, or other wireless applications, were implemented in hardware on custom ASICs (special-purpose chips).  The SDR observation was that digital signal processing was becoming incredibly powerful and that general-purpose DSP chips were available.  SDR argued that it made sense to move many aspects of a radio implementation, such as the choice of radio frequency and the way that data was encoded onto the analog signal, into software.  The pitch was that, very literally, software would define the radio.
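To make “software would define the radio” a little more concrete, here is a toy sketch of the encoding step written as ordinary code.  It is not taken from any real SDR toolkit: the CONSTELLATIONS table and the modulate function are names I made up for illustration, and a real radio would also need pulse shaping, filtering, and a front end.  The point is only that once the mapping from bits to signal is a table in software, switching schemes is a data change rather than a new chip.

```python
import math

# Amplitude that puts the QPSK points on the unit circle.
_A = 1 / math.sqrt(2)

# Hypothetical lookup tables: each scheme maps a group of bits to one
# complex baseband symbol (real part = in-phase, imaginary = quadrature).
CONSTELLATIONS = {
    "bpsk": {(0,): complex(1, 0), (1,): complex(-1, 0)},
    "qpsk": {  # Gray-coded: neighboring points differ in a single bit
        (0, 0): complex(_A, _A),
        (0, 1): complex(-_A, _A),
        (1, 1): complex(-_A, -_A),
        (1, 0): complex(_A, -_A),
    },
}


def modulate(bits, scheme):
    """Map a list of 0/1 bits to complex symbols using the named scheme."""
    table = CONSTELLATIONS[scheme]
    bits_per_symbol = len(next(iter(table)))
    if len(bits) % bits_per_symbol != 0:
        raise ValueError("bit count must be a multiple of bits per symbol")
    return [
        table[tuple(bits[i:i + bits_per_symbol])]
        for i in range(0, len(bits), bits_per_symbol)
    ]


if __name__ == "__main__":
    payload = [0, 1, 1, 0]
    # Same payload, two different "radios", chosen at run time.
    print(modulate(payload, "bpsk"))
    print(modulate(payload, "qpsk"))
```

Supporting a new encoding here means adding an entry to a dictionary, which is roughly the kind of flexibility the SDR pitch was after.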

This was a really powerful idea: the hardware-software boundary on wireless communications was really, really rigid.  Hardware guys built the physical layer and software guys built protocols on top of that.  There were clear examples — things like trying to run TCP/IP over GPRS radio links — where these two layers were clearly at odds, and performance sucked.  By hoisting access to the physical layer up to a level where you could actually program it, the hope was that protocol implementations would innovate faster, that there would be tighter integration across layers of the communication stack, and that devices would have a longer useful lifetime because they could be patched in software to support new protocols.

“Software definition” in this context was a pretty clear goal: it was about taking static and inflexible interfaces (those exposed by radio hardware) and giving software the ability to change them.  SDR has provided an arena for a lot of technical discussion and innovation over the decade and a half since then, including things like FCC comments on SDR implementations and real SDR products starting to emerge.  “Software defined” really meant building wireless devices differently by challenging the static, established layering in the communications stack.

Software Defined Networks

Computer networks have become a lot like the physical radios that motivated SDRs. Both the Internet at large and the enterprise networks that connect to it are static and slowly changing things.  They are based on a collection of well-established and general-purpose protocols (BGP, OSPF, STP, etc.) and they have protocol-based mechanisms for virtualization and isolation (like Ethernet VLANs).

The miraculous thing about these protocols is that they work at all.  It’s mind boggling that we have been able to build networks at the scale of the Internet, using components that are implemented by so many different organizations, functioning as a completely distributed system.  However, they don’t always work well for specific networks or application requirements:  Ethernet and TCP/IP are universal, but switches are painful to configure and maintain, and the resulting behavior of network traffic is more of an emergent property than one that organizations are able to properly engineer.

Software-defined networking makes the observation that the hardware underneath these protocols is much more general-purpose than the interfaces that are currently exposed.  Similar to the case of radios above, the existing protocols and vendor CLIs make that hardware unnecessarily rigid, and there is a similar opportunity to allow higher-level software to define the way that the network behaves.

In a really great recent article in ACM Queue, several leading SDN researchers have summarized the history of programmable networks and called out two properties that “define” Software Defined Networking:

  1. SDN achieves a strong separation between control and data planes in the network as a whole: this is something that was already happening “in the small” as part of switch architecture, but that was lost in the emergent large-scale systems that resulted from switches being connected into networks.
  2. The control plane is consolidated in a manner that lets you actually program it.  The result of this second property is that unlike the current set of approaches to configuring packet-switched networks, where a carefully designed protocol like BGP or spanning tree runs on every device and results in a network-wide property, a single central program can be run with a view of the entire network.

In combination, these properties lead to a really interesting change in the way that networks are built and managed: rather than being a whole bunch of independent devices that need to be individually configured and tuned to achieve some outcome, SDN allows a consolidated point of control, with a view of the entire network, to make smart, global decisions.  One example of the power of this approach is in the way that Google has taken advantage of SDN to build and manage an enormous wide-area internal network.
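As a rough sketch of what that consolidated point of control might look like, here is a toy “controller” in a few lines of code.  To be clear, this is not OpenFlow or any real controller API: the topology, the switch names, and the function names are all invented for illustration.  The idea is that a single program holding the whole topology can compute a forwarding table for every switch at once, instead of each device running a distributed protocol and hoping the right network-wide behavior emerges.

```python
from collections import deque

# Hypothetical topology: switch name -> set of directly connected switches.
TOPOLOGY = {
    "s1": {"s2", "s3"},
    "s2": {"s1", "s4"},
    "s3": {"s1", "s4"},
    "s4": {"s2", "s3", "s5"},
    "s5": {"s4"},
}


def next_hops_toward(topology, dst):
    """Breadth-first search outward from dst.

    Returns {switch: neighbor}, where neighbor is one hop closer to dst
    along a shortest (fewest-hops) path.
    """
    next_hop = {}
    visited = {dst}
    queue = deque([dst])
    while queue:
        node = queue.popleft()
        # Sorted so that ties between equal-length paths break predictably.
        for neighbor in sorted(topology[node]):
            if neighbor not in visited:
                visited.add(neighbor)
                next_hop[neighbor] = node
                queue.append(neighbor)
    return next_hop


def compute_forwarding_tables(topology):
    """Control plane: build {switch: {destination: next hop}} from a global view."""
    tables = {switch: {} for switch in topology}
    for dst in topology:
        for switch, hop in next_hops_toward(topology, dst).items():
            tables[switch][dst] = hop
    return tables


if __name__ == "__main__":
    # A real controller would push these tables down to the switches,
    # which would then do nothing but forward packets (the data plane).
    for switch, table in sorted(compute_forwarding_tables(TOPOLOGY).items()):
        print(switch, table)
```

Because the decision logic lives in one place, changing policy (say, avoiding a congested link) would mean changing this one program rather than reconfiguring every device.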

At this point, it’s worth calling out a couple of things that SDN and SDR share:  First of all, there is the property that the lowest layers of the stack in both of these contexts are “stuck”.  There’s a great opportunity to make them better by exposing more of them to software, but achieving that requires significant change.  Second, in both of these cases the “software defined” term articulates a goal and has led to an interesting and constructive discussion about how the technology should change and evolve.  The open development of protocols like OpenFlow and public consortia like the OpenDaylight project have led to a lot of discussion and progress as to what SDN actually is, both in research and industrial contexts.

Software Defined Storage

If there is a clear and common theme between SDR and SDN that might apply to storage, it’s got to be this same notion of “stuckness.”   In the enterprise, storage has always been a device.  A container.  A thing that you buy from someone and put your data into, at least until that device gets old and needs replacing.

Storage has gotten this far through the standardization of a few block- and file-level protocols, and shares with networking a sort of lazy ossification that follows from the wash, rinse, and repeat cycle of vendors building basically the same products for at least two decades.  There is an opportunity to shake up this rigid layering, and to ask more of storage implementations.  And maybe, just as with networking and radios, there is an opportunity to make storage turn a significant corner and to open up new opportunities for applications that depend on it.

In the next few posts I’ll argue that storage needs to change in some very specific ways in order to continue to solve problems, and to pull its own weight in the evolution of the datacenter.  In particular, I’m going to talk through the following points:

  1. “Software Defined Storage” does not mean “storage shipped as a piece of software”.  This is a matter of packaging, hardware qualification, and customer preference.  Instead, focus should be on the challenges that we face today in scaling, evolving, and presenting storage resources as an elastic utility.

  2. “Full convergence” of storage and compute is a lazy ideal.  Unless all your workloads use exactly the same amount of storage, compute, network and RAM, expecting to resource them economically with a collection of cookie-cutter bricks is probably not the right way to think about either scale or hardware investment.

  3. Improving storage means enabling new things, not just doing a better job of old things.  SDN is making networks easier to manage and solving some really tedious operational challenges.  It’s also enabling entirely new applications that simply couldn’t be built before.  Storage is a victim of its own aging protocols — these need to be supported, but applications should be able to drive interesting new functionality as well.

Storage is changing, whether we call it software defined or not.  Let’s talk about what that change needs to look like, and what problems it needs to solve.

Note: The image in this post is an edited version of a McEwan’s India Pale Ale coaster, based on the original image that is available on Wikimedia Commons.

Interested in learning more about Coho and our products? Check out ESG’s report on our initial product offering, or our slightly gorier technical white paper that describes the system in a bit more detail.
