TFD9 Pre-work: Neverfail

Neverfail are privately held, so no spreadsheets, alas.

Their core product is their Heartbeat Failover Engine. It’s Windows only, which is a shame, but hey, it’s a big enough ecosystem to make money.

From the research I’ve done so far, it appears to be a software watchdog that uses an expert system (a “rules-based system”) to detect likely failures and move active sessions across to another node somewhere in your network. It also does data replication.
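To make the watchdog idea concrete, here is a minimal sketch of heartbeat-based failover in Python. To be clear, this is my own toy illustration, not how Neverfail's engine actually works: the `Node` class, the `pick_active` function and the 5-second timeout are all names and values I've made up for the example. A real product layers rules, replication and split-brain protection on top of something like this.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical cluster member, for illustration only."""
    name: str
    last_heartbeat: float  # timestamp (seconds) of the last heartbeat seen


def pick_active(active: Node, standby: Node, now: float,
                timeout: float = 5.0) -> Node:
    """Return the node that should be serving traffic at time `now`.

    A node counts as alive if its last heartbeat is no older than
    `timeout` seconds (an assumed value, not a vendor default).
    """
    active_alive = now - active.last_heartbeat <= timeout
    standby_alive = now - standby.last_heartbeat <= timeout
    # Only fail over when the active node looks dead AND the standby
    # looks healthy; otherwise moving sessions would not help.
    if active_alive or not standby_alive:
        return active
    return standby


primary = Node("primary", last_heartbeat=100.0)
secondary = Node("secondary", last_heartbeat=109.0)

# At t=103 the primary's heartbeat is 3s old, so it stays active.
print(pick_active(primary, secondary, now=103.0).name)
# At t=110 the primary has been silent for 10s, so we fail over.
print(pick_active(primary, secondary, now=110.0).name)
```

The interesting design questions are all in what this sketch leaves out: how you decide a node is "likely" to fail before it goes silent, and how you stop both nodes believing they are active at once.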

So far it sounds an awful lot like Veritas Cluster Server and its ilk. I’m deeply familiar with VCS from some years ago, having done several significant deployments. It’ll be interesting to see how the Neverfail product differentiates itself.

Devil in the Details

Neverfail is an infrastructure solution. It’s software, ok, but it appears to live at the OS level because of the way they talk about “plugins” for applications.

This is where I digress a little to talk about my own experiences with similar software.

All these infrastructure solutions are based around a particular software world-view: Infrastructure is perfect, so software doesn’t have to worry about hardware.

It’s a complete lie, but all these infrastructure solutions exist so that we can approximate perfect infrastructure. The famous “five nines” terminology is really about how good our approximation is. Why do this? Because when you can ignore difficult issues, you can deal with other complex issues without distractions.
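For a sense of scale, the arithmetic behind the “nines” is easy to sketch. The snippet below is just my own back-of-the-envelope calculation, assuming a 365-day year and treating availability as a simple fraction of total time; the function name is made up for the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines.

    Three nines means 99.9% available, five nines means 99.999%, so the
    unavailable fraction is 10 ** -nines.
    """
    unavailable_fraction = 10 ** -nines
    return MINUTES_PER_YEAR * unavailable_fraction


for n in (3, 4, 5):
    print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes per year")
```

Five nines works out to a little over five minutes of downtime a year, which is why each extra nine gets so much more expensive to approximate.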

Recall your high-school physics classes. For some of us, it was a long time ago, so it may take a while to page those memories in from long-term storage (assuming they’ve survived). You would have dealt with point-masses, perfectly elastic springs, zero-friction environments. Why? Because friction is stupidly complex and you didn’t have the abilities to deal with it at the time. You hadn’t learned that yet.

Also, point-masses and perfectly elastic springs are actually really good approximations for the kind of problems you were dealing with. The complexities of non-uniform mass densities weren’t important for the problem you were trying to solve, and trying to include them would have made it harder for you to get an answer that was useful.

Moving reliability issues into the infrastructure means you can have a completely different set of people work on that problem, and the software folks can concentrate on purely application software issues. Historically, it was also partially due to infrastructure being really expensive. When you only have one or two of something, you need to take good care of it.

The downside to this is that if the software pretends that the infrastructure is perfect, when the infrastructure inevitably turns out not to be, the software can break in catastrophically bad ways. Data loss, getting out of sync, etc.

But if you take the other tack, and assume that the infrastructure is completely unstable, then every piece of software has to handle all its own failure protection. You end up re-solving the same problems over and over and over. And how paranoid does it have to be? You could end up with every bit of software checking and double-checking every action until, like some poor silicon obsessive-compulsive, it gets stuck inside its house washing its hands for the forty-seventh time this morning.

Like most things in life, you make a trade-off between risk and reward, between useless and perfect, and end up with something, hopefully, “good enough”.

Good Enough

I’m not expecting the Neverfail products to live up to their marketing hype, because if they did, they’d be too expensive for anyone to ever buy. Instead, I’m confident that they’ve built something good enough for the problem they’ve attempted to solve that they can sell it at a profit to customers with that problem.

Now, how well they compare to other, similar solutions is what I’ll attempt to find out. But similarity is the key here: if there’s another product, such as a sharded, distributed database, that solves the reliability issue in a different way, well, that’s not a fair comparison, because it requires a completely different application architecture to work.

Similarly, if the Neverfail product is significantly cheaper than an alternative product that has more features, again, not a fair comparison.

The goal is to see if what Neverfail has on offer is an attractive enough proposition for their target market, and how much extra value they’re giving to customers compared to their competitors for that same target market.

So that’s what I’ll try to do.

Interestingly, Neverfail apparently license their technology to SolarWinds and VMware. Given SolarWinds’ acquisitive tendencies, one wonders if Neverfail would end up as a target. I should have a better idea of the likelihood after their presentations.
