Turning the SAN ‘Inside-Out’?
An article on eWeek caught my eye this morning. Seanodes, a Paris-based startup, has announced a product which, as they put it, “allows users to share disks and arrays directly attached to, or embedded in servers as if they were part of an external array.” The idea is that just as you use server virtualization to better utilize excess server processor resources, the Seanodes system will allow you to aggregate and better utilize excess directly attached storage.
In general, I’m not a fan of putting a large amount of storage into a server. Typically, I use internal hard drives only to boot the operating system, and I put application data on some sort of centralized storage array. I’ve found that while the acquisition costs of external storage arrays are often higher than those of internal storage, the flexibility afforded by decoupling storage from the server is invaluable. Lately, I’ve been experimenting with moving even the boot media off the server.
But let’s go back to the Seanodes Exanodes product. At this point it appears to be supported on Linux only and is aimed at cluster computing. Unlike clustered filesystems such as IBM’s GPFS or Red Hat’s GFS, which allow multiple computers to access the same files, Exanodes works at the block level. The software aggregates bits of storage from all participating nodes and presents a synthetic hard drive to the system that uses it. This means that while the physical hard drives may be shared across multiple systems, the data contained within the synthetic hard drive is not shared. Storage capacity is aggregated and distributed, but the data still belongs to a single server. It’s just potentially scattered across several servers.
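To make the block-level idea concrete, here is a minimal sketch of how a synthetic drive might map its logical blocks onto storage contributed by several nodes. The round-robin placement, the `locate` function, and the node names are all my own assumptions for illustration; Seanodes has not published how Exanodes actually places blocks.

```python
from typing import NamedTuple, Sequence


class Extent(NamedTuple):
    """Where one logical block of the synthetic drive physically lives."""
    node: str         # hypothetical node name
    local_block: int  # block offset within that node's contributed storage


def locate(logical_block: int, nodes: Sequence[str]) -> Extent:
    """Map a logical block to a node using simple round-robin striping.

    This is an assumed placement policy, not the vendor's algorithm:
    block i lands on node i % n, at local offset i // n.
    """
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    if not nodes:
        raise ValueError("at least one node is required")
    n = len(nodes)
    return Extent(nodes[logical_block % n], logical_block // n)


# Hypothetical three-node cluster contributing storage to one synthetic drive.
nodes = ["node-a", "node-b", "node-c"]
layout = [locate(block, nodes) for block in range(6)]
for block, extent in enumerate(layout):
    print(block, extent.node, extent.local_block)
```

The system using the drive sees only a contiguous range of block numbers. Which node holds a given block is hidden behind the mapping, which is why the data stays private to one server even though the spindles are shared.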
Seanodes has built availability into the system. Every block of data stored in Exanodes is replicated across at least two participating nodes. Seanodes calls this RAIN, which expands to “Redundant Array of Independent Nodes”. While this insulates data access from a failed or powered-down node, keeping two copies of every block also cuts available storage capacity by at least half. The redundancy overhead is even higher if you use RAID arrays as the underlying physical storage, because the RAID overhead on each node is paid on top of the replication between nodes.
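The capacity cost is easy to estimate on the back of an envelope. The sketch below assumes every block is stored `replicas` times and that each node's local RAID leaves a fraction `raid_efficiency` of its raw disk usable; the function name, parameters, and the four-node example are mine, not figures from Seanodes. It also ignores the constraint that copies must sit on different nodes, which can lower the usable total further when nodes contribute unequal amounts.

```python
from typing import Sequence


def usable_capacity_gb(
    node_raw_gb: Sequence[float],
    replicas: int = 2,
    raid_efficiency: float = 1.0,
) -> float:
    """Estimate usable capacity of a replicated pool of node-local storage.

    raid_efficiency is the usable fraction left after local RAID
    (1.0 for plain disks, 0.5 for RAID 1, 0.75 for a 4-disk RAID 5).
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0 < raid_efficiency <= 1:
        raise ValueError("raid_efficiency must be in (0, 1]")
    raw_total = sum(node_raw_gb)
    return raw_total * raid_efficiency / replicas


# Hypothetical cluster: four nodes, each contributing 500 GB of raw disk.
cluster = [500, 500, 500, 500]
plain = usable_capacity_gb(cluster)                          # no local RAID
mirrored = usable_capacity_gb(cluster, raid_efficiency=0.5)   # RAID 1 per node
raid5 = usable_capacity_gb(cluster, raid_efficiency=0.75)     # 4-disk RAID 5 per node
print(plain, mirrored, raid5)
```

Under these assumptions, 2 TB of raw disk yields 1 TB usable with plain disks, and only 500 GB if each node also mirrors its drives locally.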
Seanodes has an interesting concept. I can see a fit for it in the commodity Linux cluster space, but I’m unconvinced that it is useful in a more general computing environment. Centralized storage is becoming cheaper by the day, and decoupling data from the server is the very heart of virtualization.
(Disclaimer: I am basing my analysis on the information published on Seanodes’ website. I have requested their whitepaper, but have not yet received it.)