CFD1 Prep: Scality


I was only peripherally aware of Scality until last week when I attended the Scality presentation at Tech Field Day Extra at VMworld 2016. I came away impressed.

Jerome Lecat is an entertaining presenter, but the product is what impressed me. Scality make a software-only, scale-out storage solution called the RING, so named because of the ring topology at the heart of its architecture.

I’ve dug into the details courtesy of a technical whitepaper you can get from the Scality website for the low, low price of fake contact details. It’s a fairly straightforward multi-layer architecture where each software layer performs a specific function.

The protocol at the core of the RING architecture is based on Chord, introduced in a 2001 paper from MIT [PDF]. It is reminiscent of other distributed protocols like Paxos or Raft, but it focuses on the ability to scale to a very large number of nodes without any node needing to know the overall state of the network, just the status of a subset of nearby nodes. Scality have made their own extensions and adaptations to Chord, and use it as part of the overall storage service.
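To make the ring idea a bit more concrete, here's a minimal sketch of a Chord-style ring, written by me purely as an illustration (it's not Scality's code): node names and object keys are hashed onto the same circular keyspace, and each key belongs to its successor node. Real Chord adds finger tables so lookups take O(log N) hops instead of a linear scan.

```python
import hashlib
from bisect import bisect_left

KEYSPACE_BITS = 32  # real Chord uses a 160-bit SHA-1 keyspace; 32 keeps the numbers readable

def ring_hash(value: str) -> int:
    """Hash a node name or object key onto the circular keyspace."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest[:4], "big") % (2 ** KEYSPACE_BITS)

class Ring:
    def __init__(self, node_names):
        # Each node sits at a point on the ring determined by its hash.
        self.nodes = sorted((ring_hash(name), name) for name in node_names)

    def successor(self, key: str) -> str:
        """A key is owned by the first node clockwise from its hash (its successor)."""
        position = ring_hash(key)
        points = [point for point, _ in self.nodes]
        index = bisect_left(points, position) % len(self.nodes)  # wrap around the ring
        return self.nodes[index][1]

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
for obj in ["photo.jpg", "backup.tar", "invoice.pdf"]:
    print(obj, "->", ring.successor(obj))
```

The nice property this buys you is that adding or removing a node only moves the keys adjacent to it on the ring, which is what lets the system grow without a central directory.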

Layered on top of this core functionality is policy-based data protection (both replication and erasure coding) and self-healing capabilities. The erasure coding implementation keeps the data chunks intact and adds separate parity data, rather than re-encoding the data as intermingled data+parity chunks. This speeds up reads, and means the relatively costly erasure coding calculations only need to be performed on rebuilds.
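To see what "keeping the data chunks intact" buys you, here's a toy sketch of a systematic erasure code using plain XOR parity (one parity chunk, tolerating one lost chunk). It's my own illustration of the general technique, not Scality's actual scheme, which uses more capable codes, but it shows why reads stay cheap: the data chunks are stored unmodified, and decoding only happens on rebuild.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data_chunks):
    """Systematic encoding: data chunks are left untouched, one XOR parity chunk is added."""
    parity = reduce(xor_bytes, data_chunks)
    return list(data_chunks), parity

def read(data_chunks):
    """Normal read path: no decoding, just reassemble the intact data chunks."""
    return b"".join(data_chunks)

def rebuild(surviving_chunks, parity):
    """Repair path: recover a single lost data chunk by XOR-ing parity with the survivors."""
    return reduce(xor_bytes, surviving_chunks, parity)

chunks = [b"AAAA", b"BBBB", b"CCCC"]
stored, parity = encode(chunks)
assert read(stored) == b"AAAABBBBCCCC"                      # fast path: untouched chunks
assert rebuild([stored[0], stored[2]], parity) == b"BBBB"   # decode only when a chunk is lost
```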

Accessing the storage is performed through an access layer (funny, that) using Connectors. Scality has an object storage heritage, but the underlying object store also has a native scale-out filesystem called SOFS that uses an internal distributed database called MESA to store the file metadata (inodes, directory hierarchy information, etc.). It's not clear to me how SOFS/MESA and the Chord keyspace and distributed hash table interact, so that's something I can ask about during CFD1.
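Since I don't know how SOFS/MESA actually sits on the Chord keyspace, the sketch below is just my guess at the general pattern: a scale-out filesystem can layer itself over a key-value store by hashing paths (or inode IDs) into keys and storing the metadata as values. Every name in it is hypothetical.

```python
import hashlib
import json

class KeyValueStore:
    """Stand-in for a distributed key-value layer (e.g. keys hashed onto a ring)."""
    def __init__(self):
        self._data = {}
    def put(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data.get(key)

def metadata_key(path: str) -> str:
    # Hypothetical: derive the store key from the path, so lookups need no central server.
    return "inode:" + hashlib.sha1(path.encode()).hexdigest()

store = KeyValueStore()

# Hypothetical inode record; directory entries could be stored the same way.
store.put(metadata_key("/projects/report.docx"),
          json.dumps({"size": 4096, "mode": 0o644, "chunks": ["chunk-1", "chunk-2"]}))

print(json.loads(store.get(metadata_key("/projects/report.docx"))))
```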

Scality uses an AWS S3 compatible API as well as its own native REST API for object access. AWS S3 is the de facto standard for object storage access now, so we should all just settle on the basic semantics of the protocol and move on. Scality also supports OpenStack Swift and Cinder, and somehow Glance and Manila as well, which intrigues me. I'm not an OpenStack guru, so I'll be interested to hear more about how Scality interacts with OpenStack in these different ways.
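Because the RING speaks the S3 API, any stock S3 client should be able to talk to it just by overriding the endpoint. Here's a minimal boto3 sketch; the endpoint URL and credentials are placeholders I've invented, not real Scality values.

```python
import boto3

# Point a standard S3 client at an S3-compatible endpoint instead of AWS.
# The endpoint URL and credentials below are made-up placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="https://ring-s3.example.internal",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

s3.create_bucket(Bucket="cfd1-demo")
s3.put_object(Bucket="cfd1-demo", Key="hello.txt", Body=b"hello from tech field day")
obj = s3.get_object(Bucket="cfd1-demo", Key="hello.txt")
print(obj["Body"].read())
```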

Scality also supports NFSv3, SMB 2.0, and Linux FUSE for filesystem access, all of which talk to SOFS. Scality claim this is an improvement over some competitors that use a gateway approach to filesystem access, but really the Connectors are a gateway to the underlying system. The gateway mechanism is just baked into the product, so yes, it probably does provide some advantages, but again I'd like to know more.

It’s still pretty great that the one system can speak all of these different protocols. There’s no block access, but I’m ok with that, because LUNs are stupid and need to die.

RING v6 adds a bunch of enterprise features as well, such as Identity and Access Management through Active Directory integration, and Single Sign-On via SAML assertions. These kinds of features make systems more attractive to enterprises because they can interoperate with the existing infrastructure, processes, and systems that an enterprise already has.

Newfangled startup things are fine for point solutions where you really need something new, but once you start extending into the rest of the organisation from your comfy toehold, you start needing to play well with others.

Scality are definitely one to watch, and I look forward to learning more about them.
