DFD1 Prep Post: Sandisk

Sandisk did a great job with their presentations at Storage Field Day 5 (which is where the SFD5 group photo was taken), so I’m looking forward to seeing what they have to say that relates to data.

Sandisk are primarily a flash storage company, and I gave a potted history of the company as preparation last time. They’ve stayed impressively focused on what they do, even recently spinning out NexGen Storage as a separate company rather than holding on to something they don’t think fits with the main organisation. NexGen was acquired by FusionIO, which renamed the product to ioControl, and Sandisk then acquired FusionIO. Sandisk clearly didn’t see the NexGen stuff fitting with their plans, but rather than just dissolving it internally, they spun it out.

This impresses me, because the way it was done shows real self-knowledge. Corporate strategy should be about making sure that a company (or division, or group, or whatever) is more valuable being owned by you than by someone else. Anyone else. And if NexGen does well, it increases the market for Sandisk’s products, so why not let them go off and try? (Disclosure: I’ve done some writing for NexGen in a commercial capacity, so I’m biased towards liking them).

What Will We See?

I’m intrigued by what Sandisk might present to us. They have a bunch of flash products, including the recently announced InfiniFlash system (which has quite a snazzy bezel, in my opinion), but vendors flogging product isn’t interesting to people running a business. If you’re just buying kit because it’s shiny and new, you’re either trying to show off (Apple Watch, anyone?) or it’s a hobby (like all the microphones I seem to have now). When you’re running a business, you buy kit to solve a business problem, if you’re smart (and not using your employer’s money to fund your hobby/need to show off).

And this is about where flash vendors and I usually start to disagree about what they’re doing. They bang on about cost savings, and while yeah, you can save a few quid on power and cooling, that’s not why you buy flash. It can take up less room than older kit, but that’s always true of new gear (ENIAC was 167 m²), and unless you can give the space back (like if you’re renting it at a colo space) you don’t save any money. Not being able to fill the space with something else is an opportunity cost, but it’s not cash flow, while shelling out for new kit is.

Flash is great because it’s fast. And when you’re doing analytics, that’s important. The cost saving really kicks in when you compare flash to what getting that sort of performance out of spinning disk would cost you, or what that amount of storage would cost in RAM. I don’t need a lot of storage in my laptop, but fast storage makes the experience qualitatively different. Similarly, the flash drive in my lab (thanks to Micron for the gift at VMworld 2014) makes processing audio, video, and large datasets much easier, and in a way that’s more than just faster. I can attempt to do things that I wouldn’t have tried to do, because I can.

That’s the fundamental benefit of flash. I wish more vendors concentrated on that message instead of trying to make flash seem the same price as spinning disk with tricky statements about dedupe rates and ‘effective’ storage.

Storage is Slow Memory

Sandisk have Memory Channel storage, courtesy of their Diablo Technologies partnership, and now that the lawsuits with Netlist are resolved largely in Diablo’s favour, Sandisk can get on with selling their ULLtraDIMMs. It’s still a bit of a halfway step, thanks to the SATA bridge inside the device to link the flash to the DIMM connector, but it’s a step in the right direction.

An ULLtraDIMM

When you want to do analytics, you want it to be fast. There are a bunch of in-memory database and analytics technologies (like SAP HANA, MemSQL, and others) designed to use the fastest storage we have, which is memory. The trouble with memory is that it isn’t persistent; if you lose power, you lose all your data.

That’s bad.

Flash is great because it’s orders of magnitude faster than persistent storage on spinning drives, without being as many orders of magnitude more expensive per GB as memory is. That makes it a good trade-off for the large-ish datasets you need for interesting data analysis: you can process them quickly, and you won’t lose them halfway through a large processing run because of a power glitch. By putting a subset of your data on flash instead of spinning media, you can process it cost-effectively.

Flash isn’t so great for long term storage, because it does still wear out a lot faster than spinning disk or tape. It’s not cost-effective to use it for longer term storage, depending on what “longer term” means for you. Flash is currently in an economic sweet spot where it’s fast enough, cheap enough, and durable enough that it’s worth using for a lot of things that we used to put on wide-striped spinning disk until very recently.

But we still use it like storage. We access it via filesystems that were designed on the assumption that storage is a long way from the CPU. If the storage comes closer to the CPU, we can use more memory-like mechanisms to access it from our programs (MOV AX,4, or memcpy(), or mydict = {}, instead of fp = open("/my/filepath")).
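To make the contrast concrete, here’s a minimal sketch in Python of the two access styles applied to the same bytes. It uses mmap from the standard library as a stand-in for memory-like access. The scratch file and its name are assumptions for illustration only, and a memory-mapped file still goes through the operating system’s page cache, so this shows the programming model rather than what an ULLtraDIMM actually does.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file standing in for a region of persistent storage.
path = os.path.join(tempfile.mkdtemp(), "region.bin")

# Storage-style access: open a file handle and write through it.
with open(path, "wb") as fp:
    fp.write(b"\x00" * 4096)

# Memory-style access: map the file, then use plain slice assignment.
with open(path, "r+b") as fp:
    with mmap.mmap(fp.fileno(), 4096) as region:
        region[0:5] = b"hello"  # looks like a store into a byte array
        region.flush()          # ask the OS to write the change to the backing file

# The bytes are still there after the mapping has gone away.
with open(path, "rb") as fp:
    data = fp.read(5)
```

The interesting line is the slice assignment: there’s no read(), write(), or seek() anywhere near it, yet the data ends up somewhere that survives the program exiting.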

What’s Next?

I’d like to see something from Sandisk about how we’ll start to write programs differently to make use of persistent memory. That’s going to be a massive shift for today’s programmers who are used to all memory being temporary. We’ll need new primitives to make it clear that we want a chunk of persistent memory. permalloc() perhaps? Maybe they can tell us how people are starting to use ULLtraDIMM in practice, now that it’s been available on the market for a while. Are people doing some interesting analysis jobs using the stuff? Has it influenced the way database or application design is being done?
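As a thought experiment, here’s roughly what a permalloc() could feel like from the programmer’s side, sketched in Python. To be clear, permalloc() is a name I made up, not a real API. This version fakes persistence with a memory-mapped file, and it makes no attempt at crash consistency, which a real persistent memory primitive would have to deal with.

```python
import mmap
import os
import struct
import tempfile

def permalloc(path, size):
    """Hypothetical helper: return a writable buffer of `size` bytes backed by the file at `path`."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        # Grow the backing file if needed; new bytes read back as zeros.
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping keeps its own duplicate of the descriptor

def bump(path):
    """Increment a 64-bit counter that lives in 'persistent' memory."""
    buf = permalloc(path, 8)
    (count,) = struct.unpack_from("<Q", buf, 0)
    struct.pack_into("<Q", buf, 0, count + 1)
    buf.flush()
    buf.close()
    return count + 1

# Hypothetical location for the persistent region.
path = os.path.join(tempfile.mkdtemp(), "counter.bin")
first = bump(path)   # stands in for one run of a program
second = bump(path)  # stands in for a later run; the count carries over
```

The counter is never explicitly saved or loaded. The program asks for a chunk of memory by name and what was there last time is still there, which is exactly the habit today’s programmers don’t have.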

My suspicion is that we’ll spend quite a while with some sort of bridging mechanism, for the same reason ULLtraDIMM has a DIMM/SATA bridge. There’s just so much stuff already out there using the existing methods that we can’t change it all at once. If we want to take advantage of the new technologies, we need a way to bring all the now-legacy stuff along with us.

Hopefully it won’t last too long, or cause too many new issues. The last thing we need is another way to keep Windows NT 3.51 applications alive.
