Komprise is Klever

Komprise is essentially an auto-tiering solution for file data. It can index metadata from primary file data and answer questions like “How many video files on my filers haven’t been accessed in six months or more?” You can then move this cold data to lower-cost storage, automatically or manually. That lower-cost storage can be local or remote, including in the cloud, and it can be S3 object storage or NFS/SMB storage.
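To make that kind of question concrete, here is a minimal Python sketch of a cold-data query against a local directory tree. It is not how Komprise builds its index. The function name, the list of video extensions, and the 180-day threshold are all assumptions for illustration, and it relies on the filesystem actually recording access times (many are mounted with noatime or relatime).

```python
import os
import time
from pathlib import Path

# Illustrative set of extensions; adjust to whatever counts as "video" for you.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv"}


def find_cold_files(root, max_idle_days=180, extensions=VIDEO_EXTENSIONS, now=None):
    """Return files under root with a matching extension whose last access
    time is older than max_idle_days. Hypothetical helper, not a Komprise API."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds in a day
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix.lower() not in extensions:
                continue
            # lstat so that a dangling symlink does not raise
            if os.lstat(path).st_atime < cutoff:
                cold.append(path)
    return sorted(cold)
```

Calling find_cold_files("/mnt/filer") would return the candidates for tiering, where "/mnt/filer" is a placeholder mount point.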

For organisations with large amounts of unstructured file data lying around, this can provide substantial operational savings. It’s also handy for security because you can’t secure what you don’t know you have. Knowing what data you have, and where, is pretty important.

The flexibility Komprise provides, combined with the very readable dashboard interface, makes it a simple and straightforward way to manage unstructured data.

The way Komprise handles the movement of data is quite nifty as well.

Transparent Move Technology

Komprise moves data using a clever indirect reference technique that tricks a regular filesystem into helping it do its magic.

When Komprise moves a file to some other storage system, it replaces the original file with a symbolic link (or symlink), which both the NFS and SMB protocols support as standard. The symlink is a pointer to a Komprise Access Address (KAA), which is an NFS or SMB target on the local Komprise grid. So far, this is basically the same as if you moved a file manually and then created a symlink to the new file location, or used an HTTP 301 redirect response on a webserver to advise clients looking for the file where it lives now.
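The manual version of that trick is easy to sketch in Python. This is only the do-it-yourself analogy, not Komprise's implementation: the link here points straight at the new file location rather than at a KAA, and move_and_link is a hypothetical helper name.

```python
import os
import shutil
from pathlib import Path


def move_and_link(src, dest_dir):
    """Move src into dest_dir and leave a symlink behind at the old path.
    Hypothetical helper illustrating the manual move-plus-symlink approach."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.move(str(src), str(dest))
    # Clients opening the old path now follow the link to the new location.
    os.symlink(dest.resolve(), src)
    return dest
```

After the move, anything that opens the old path still gets the file contents, which is the property the symlink approach depends on.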

The Komprise grid exposes an NFS/SMB interface that makes it look like a filer. A client that follows the symlink ends up talking to this interface, and the Komprise grid then proxies access to the file data. Behind this interface is the Komprise Cloud Filesystem (KCFS) that maintains a mapping between the KAA and the actual file data, wherever it happens to live.

This is what allows Komprise to move SMB or NFS data onto an object store like AWS S3 but still serve data to SMB clients. It also responds with the original metadata for the file, so the file data looks just like the original for all intents and purposes.

This process of moving data means Komprise can be easily inserted into an active data path without having to remap data or reconnect drive shares or anything. These kinds of remappings involve outages to upstream systems, so they tend to be harder to implement than this kind of drop-in deployment.

The File Is An Object

When Komprise moves files, each file becomes a single object on the remote S3 store. This has a couple of advantages over alternate approaches that use S3 as block-addressable storage for a filesystem abstraction.

Firstly, you can access the files directly via S3. This is quite handy for certain applications that only speak S3, and Komprise can create copies of data—rather than just moving it—so you can provide a secondary copy for analysis applications, for example.

It also provides good resiliency: because the Komprise grid is essentially stateless, Komprise can completely rebuild its KAA directory from the S3 copies. Other approaches become highly dependent on the indirection database continuing to exist. If you lose the mapping data, you lose the filesystem data, which could be catastrophic. Older-style backup systems tended to function this way, with their catalog of which tape held which data.
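Here is a rough Python sketch of why one-object-per-file makes that rebuild possible. It assumes each stored object carries its original path as metadata, which is my assumption and not a documented Komprise layout. The StoredObject class, the "original-path" metadata key, and the plain dict standing in for an S3 bucket are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Stand-in for an object in a bucket: payload plus user metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


def rebuild_directory(bucket):
    """Rebuild a path -> object key mapping purely from the objects themselves.
    bucket is a dict of {object_key: StoredObject} standing in for an S3 listing."""
    directory = {}
    for key, obj in bucket.items():
        original = obj.metadata.get("original-path")  # assumed metadata key
        if original is None:
            continue  # not a tiered file, or metadata is missing
        directory[original] = key
    return directory
```

Because everything needed to rebuild the mapping lives alongside the data, losing the directory is an inconvenience rather than a data loss event. A block-style layout has no equivalent, since a single block says nothing about which file it belongs to.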

It also means you can move NFS data onto an SMB backing store, or vice versa, and Komprise performs the intermediate lookup.

Plays Well With Others

The hot data remains on the primary system, so Komprise doesn’t affect the performance of the data you need most, and if moved data suddenly becomes needed again, you’ll still be able to access it. If you find you’re accessing previously archived data a lot, it can be recalled back into the primary storage and Komprise will get out of the way completely again.

The ability to undo and remove Komprise cleanly and easily like this deserves some major kudos, as does being able to bypass Komprise if you want to access the migrated data directly. Once your data has been touched by the system, you’re not beholden to whatever Komprise decides you should be able to use it for.

Doing what I need it to do, and then getting out of my way, is a design ethos that I wish more vendors embraced. Having to wrestle with the whims of whatever a vendor’s product management team has decided is the Next Big Thing is particularly frustrating when you’re in an enterprise team that thinks in terms of 5-10 year planning cycles.

A new CIO could mean a completely new approach that has nothing to do with whatever we did three years ago, and data migrations are particularly fraught affairs. The whole point of data management tools should be embracing the need to change where data is at any given moment, and making that migration process as painless as possible.

Komprise is one of the first tools I’ve seen that seems to do a good job of that, and it’s well worth investigating if you have a lot of unstructured file data to manage.
