I had the privilege of attending a presentation on dTrace by one of its authors: Bryan Cantrill. Bryan is a serious ubergeek and dTrace is the best thing to happen to Solaris since... well, since I can remember. I am in awe.
dTrace is insanely great. It’s a serious piece of software engineering that’s obviously been put together by some seriously smart guys. Not only that, these are seriously smart guys who have suffered under the same conditions as their target audience: systems admins who have a production server in trouble and it’s not obvious what’s wrong. They’ve used all the standard tools we all use to diagnose problems: top, mpstat, truss/strace, lsof, etc. They know how much it sucks trying to figure out what process is reading or writing to what file on what spindle under Solaris 8. They know what a drag it is when you’re trying to figure out what’s eating all your resources by using top or truss only to have them report that what’s eating all your resources is top or truss. But no more.
dTrace lets you dig into the stupid little details (like truss -x all -v all) without sending your load average up over 100. You can progressively dig into what’s actually going on with a running process a lot more easily with just a single tool instead of having to build a collage with a bunch of different views from different tools. There’s a lot less guesswork required and it’s a lot faster to test out a hypothesis. Using dTrace will significantly reduce the time taken to fix problems.
Bryan is a pretty good presenter, too. He looks like the prototypical geek: glasses, tousled brown hair, slightly awkward-looking stance. When he speaks, the words tumble out in a cascade, his brain running several times faster than his mouth. He’s excited about what he’s doing and obviously having a lot of fun doing it. He typed out long lines of arcane dtrace commands like a virtuoso, which I suppose is to be expected from one of the people who invented the thing. He threw in amusing anecdotes to keep the audience engaged while also explaining the most obscure kernel facts.
I learned a lot from the presentation, but I suspect that of the 30-40 people in the room, at most a dozen really understood what was said. This was not a talk for managers or novices. Bryan provided interesting business-related anecdotes about how dTrace would help save money by fixing problems quickly, but most of the time he was geeking on. There were obscure jokes about linker manuals, references to previous versions of pfiles and discussions about thread locking bugs, most of which flew over the heads of at least half the audience. It was great.
My colleagues and I had a chat to Bryan afterwards about a bunch of topics close to our hearts at the moment: storage performance, NFS POSIX file locks, ZFS and FireEngine. He knows a lot about a lot of things and is more than happy to chat to like-minded folk. He’s the sort of guy I’d love to work with, and I envy that he’s having a lot of fun doing something he enjoys and getting paid for it to boot. One day I hope to do the same.
Oh, and he runs gnome on XOrg under OpenSolaris on amd64.