In a world that’s gone software-defined mad, X-IO Storage are a refreshingly honest company seemingly devoid of hype. They make storage hardware, and specialised hardware at that.
For the software-defined cheer-squad, this might sound like madness, but it’s not.
First of all, software has to run on something, and that something is called hardware. Many companies pride themselves on building from commodity components, but just throwing a bunch of commodity components in a box doesn’t make the box a commodity. If it did, there would be no such thing as a hardware compatibility list. Check to see if your favourite so-called software-defined vendor has an HCL. Just how software-defined is it, really?
The problem with a commodity is that it’s hard to distinguish yourself from any other vendor. What’s the difference between brand-A flour and brand-B flour? Flour is flour, right? Well no, not quite, but the distinctions are subtle, and not everyone cares.
Hard disks aren’t even that much of a commodity. They sell for different prices, and they have different failure rates, as Backblaze found and published earlier this year.
And for X-IO’s customers, who are largely service providers or cloud service providers, storage reliability is important. Note that Backblaze have redesigned their storage pods to reduce vibration and increase drive life, reliability, and replaceability. Read through that post and realise just how much work goes into a robust, reliable storage system design. Storage is about far more than just the drives.
The X-IO Advantage
What X-IO have done, with over 10 years’ experience, is to build a predictable, reliable storage system. It’s not the fastest, or the cheapest, but optimising for just one of those things is not what X-IO’s customers need. X-IO is aiming for a specific zone where they trade off performance, capacity, reliability, and price, and they were very clear in their presentation what that zone looks like.
My favourite part of the ISE design is that it views the failure domain of HDDs not as whole drives, but as the ‘surfaces’ of the platters inside the drives. As X-IO told us, when a disk ‘fails’, 50-85% of the time there’s nothing wrong with it. Instead of failing the drive, they use their more intimate knowledge of what’s going on, gathered with custom firmware, to fail only a single surface if that’s what’s wrong with it. Their systems can also reset drives, reinstall firmware, and do other things to remove the need for failing individual drives. Instead, the disks are grouped into datapacs that are only removed once enough of their spare capacity has been used up replacing failed surfaces.
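To make the surface-as-failure-domain idea concrete, here’s a toy model in Python. It is only a sketch of the concept as described above, not X-IO’s firmware or any real API: the `Drive` and `Datapac` classes, the idea of counting spare capacity in whole surfaces, and the rule that a datapac stays in service while failed surfaces don’t exceed its spares are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    """Hypothetical drive: a number of platter surfaces, some of them retired."""
    surfaces: int
    failed: set = field(default_factory=set)


@dataclass
class Datapac:
    """Hypothetical datapac: a group of drives sharing one pool of spare capacity."""
    drives: list
    spare_surfaces: int  # assumption: spare capacity measured in whole surfaces

    def failed_surfaces(self) -> int:
        # Count retired surfaces across every drive in the datapac.
        return sum(len(d.failed) for d in self.drives)

    def in_service(self) -> bool:
        # Assumption: the datapac is pulled only when failures exceed the spares.
        return self.failed_surfaces() <= self.spare_surfaces

    def fail_surface(self, drive_idx: int, surface_idx: int) -> bool:
        """Retire a single surface rather than the whole drive.

        Returns True while the datapac can remain in service.
        """
        drive = self.drives[drive_idx]
        if not 0 <= surface_idx < drive.surfaces:
            raise IndexError("no such surface on this drive")
        drive.failed.add(surface_idx)
        return self.in_service()


# Ten drives with four surfaces each, and two surfaces' worth of spare capacity.
pac = Datapac([Drive(surfaces=4) for _ in range(10)], spare_surfaces=2)
first = pac.fail_surface(0, 1)   # absorbed by spare capacity
second = pac.fail_surface(3, 0)  # absorbed by spare capacity
third = pac.fail_surface(3, 2)   # spares exhausted: time to replace the datapac
```

The point of the toy is that no individual drive ever gets pulled. Failures are absorbed at the surface level, and the only replacement event is the whole datapac, once its spare pool is gone.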
With a warranty period of 5 years, that’s a lot of stability for customers who may have hundreds of these systems in their datacentre. No more employing someone just to wander the halls replacing dead drives. No more securely shredding drives that are still 90% usable. It saves customers money, and it saves X-IO money too. It also saves customers aggravation from constantly replacing failed cheap commodity components.
What About Flash?
This approach is all very well for spinning media with platters and surfaces, but what about flash? Well, X-IO include flash in their higher-end arrays for acceleration, and the level of integration with SSDs doesn’t match the HDD integration (custom HDD firmware), at least not yet.
But notice how the handling of internal ‘surface’ failures in HDDs, and ‘remanufacturing’, is very similar to the way flash is deployed today. SSDs contain more storage than the advertised capacity; a 100GB flash drive can have 128GB of flash actually installed, giving you 28GB (28% of the advertised capacity) as ‘hot spare’. Flash firmware is really complicated in the way it does wear-levelling, and in how it decides when a cell is bad and should no longer be used (SanDisk’s Guardian technology is one example). It’s broadly similar to how bad sectors are handled with HDDs, or bad ‘surfaces’ in the way X-IO does datapacs.
I can see X-IO partnering closely with a flash manufacturer, in the same way they partner closely with HDD manufacturers, in order to make flash more reliable and predictable. Why not aggregate the spare capacity of individual flash modules across an entire datapac? Over time, I can see all-flash datapacs being introduced to create even higher-performance arrays, and with even greater reliability than HDDs.
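The spare-capacity arithmetic is worth spelling out, because the percentage depends on what you divide by. A quick sketch in Python, using the 100GB/128GB figures from the example above (the `overprovisioning` function name is just mine):

```python
def overprovisioning(raw_gb: float, advertised_gb: float):
    """Return spare capacity in GB, and as a fraction of advertised and of raw capacity."""
    spare_gb = raw_gb - advertised_gb
    return spare_gb, spare_gb / advertised_gb, spare_gb / raw_gb


spare, vs_advertised, vs_raw = overprovisioning(raw_gb=128, advertised_gb=100)
print(f"{spare} GB spare: {vs_advertised:.0%} of advertised, {vs_raw:.1%} of raw flash")
```

So the same 28GB is 28% when measured against what you were sold, and a little under 22% of the flash physically in the box.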
Know Your Place
Another thing I like about X-IO’s focus on hardware is that they just make the storage great storage. They don’t try to build in a zillion software features that are better left to somewhere else in the stack, like hypervisors for virtualised environments. That saves money, because you don’t double up on features in multiple places in the stack, and X-IO can concentrate their energies on making great storage, not HA/DRS systems.
In some ways, this is like having commodity disk. You leave all the smarts to software running higher in the stack, and have the physical gear be really cheap and stupid. The problem with really cheap, stupid physical gear is that it breaks a lot, and you spend a lot of time organising swapouts. If the physical swapout were as automated as the scale-out cluster failover software is, it wouldn’t be a problem, but we don’t have datacentres full of conveyor-belts ferrying new hardware into position. Humans have to do it, and unlike the price of infrastructure, the price of labour keeps increasing.
Highly reliable gear means you don’t have to automate the swapout and failover. That’s what infrastructure has been trying to do for decades: pretend hardware never fails so the software can be simpler (it doesn’t have to handle all the failure modes of hardware). With smarter software, hardware can fail more and the software will still work.
Hardware will still fail, though, and still require replacement. Getting the right balance between reliable hardware and fault tolerant software is the name of the game. X-IO help companies concentrate their efforts on places higher in the stack where they can deliver differentiated value to their customers. The storage can be installed and just sit there and run, for years at a time, in a nicely predictable manner.
I think X-IO have some good technology here, and an approach that is valuable and different from what others are doing.