Why You Should Stop Using Multiple Redo Logs with Oracle on NetApp

Using multiple redo log areas is something that many Oracle DBAs are used to doing. This post explains why you should stop doing it if you use NetApp storage for your Oracle database.

What Would I Know?

Once upon a time I was an Oracle DBA. I’d never done it before. I just got lumped with the job because there was no one else, and I knew Unix.

I learned fast because I had to. This was very much the deep end.

I devoured the manuals and, eventually, specialist texts on performance tuning. I ended up meeting and working with the authors later on, which was quite a thrill.

For the past 5 years, I’ve spent a lot of time designing computer systems that include NetApp in some way. Lots of them have had Oracle databases. Big ones. Ones that needed to perform. It’s been handy knowing how Oracle works.

But before we jump into why you don’t need multiple redo log areas any more, let’s look back at why we started using them in the first place.

When Disk Was Stupid

Back in the days when disk was stupid, you needed your software to handle everything. RAID was done in software. It was slow because computers were slow.

If you were lucky enough to be able to afford RAID hardware, all it really did was handle the parity calculations and rebuilds. It moved the RAID software from your server to a disk array to make it faster. But disk was still pretty stupid.

Different kinds of RAID have different performance characteristics, and these primitive disk arrays didn’t use any of the clever techniques modern systems use to work around the problems of pure RAID-5, or double-parity RAID.

The wisdom of the time was that you didn’t use RAID-5 if you had a write-intensive workload. You bit the bullet and wore the cost (it doubled!) of RAID 0+1.

And disk was unreliable. You could lose a filesystem relatively easily compared with today’s systems.

Why Multiple Redo Logs?

Redo logs are Oracle’s name for its transaction logs (other databases have their own equivalents). This is where each and every change to the database is tracked. Oracle always writes redo; the feature you turn on is archiving, which keeps the filled logs instead of overwriting them. If you INSERT INTO userdata, that change gets logged. Since the redo logs are written for every change to the database, the redo log disk had to be fast.

The reason you want to track every transaction is so you can recover the database to any point in time. If the userdata table gets corrupted, or someone accidentally drops the tablespace, or inserts a million rows of garbage, you can get the database back into the state it was in just before that happened.

One method of doing this is to restore the last backup, and then replay all the transactions that happened since the backup was taken. You can do this because you logged all the transactions in your redo log area (and probably archived them off to the archivelogs area).
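
Here’s a toy sketch of that restore-and-replay idea in Python. It is not how Oracle implements recovery; the RedoRecord class, the scn field (a stand-in for a system change number) and the recover_to function are all names I’ve made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """One logged change. All field names here are hypothetical."""
    scn: int        # stand-in for a system change number (ordering key)
    key: str        # which row changed
    value: object   # new value, or None if the row was deleted


def recover_to(backup, redo_log, target_scn):
    """Restore the backup, then replay logged changes up to target_scn."""
    state = dict(backup)  # step 1: restore the last backup
    for rec in sorted(redo_log, key=lambda r: r.scn):
        if rec.scn > target_scn:
            break  # stop just before the change we want to undo
        if rec.value is None:
            state.pop(rec.key, None)
        else:
            state[rec.key] = rec.value
    return state


backup = {"alice": 1}
redo_log = [
    RedoRecord(101, "bob", 2),
    RedoRecord(102, "alice", 5),
    RedoRecord(103, "garbage", 999),  # the change we regret
]

# Recover to the point just before the garbage row was inserted.
restored = recover_to(backup, redo_log, target_scn=102)
print(restored)
```

The point of the sketch is that the replay only works if every record between the backup and the target is still available.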

Which makes the redo logs pretty important. If you lose any, you lose the ability to recover to any point at or after the lost transactions, because the replay can’t skip over the gap.

Since disk used to be unreliable, and redo logs are so important, what we used to do was to write them to multiple places. Usually two, sometimes three or more. That meant that even if there was a disk failure or corruption of one redo log area, the others would be safe, and we could recover the data.

There’s an obvious downside here. You had to write the same data to disk more than once, for every database transaction. That can really slow things down if the disk isn’t fast enough.
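
To put rough numbers on that, here’s a back-of-the-envelope sketch in Python. The 50 GiB per day figure and the per-copy latencies are invented for illustration, and the latency model assumes the copies are written in parallel and a commit waits for all of them, which is a simplification.

```python
def redo_bytes_written(redo_bytes_generated, copies):
    """Total bytes hitting disk when every redo record is written `copies` times."""
    return redo_bytes_generated * copies


def commit_latency_ms(copy_latencies_ms):
    """Assumed model: a commit is only as fast as the slowest redo copy."""
    return max(copy_latencies_ms)


GIB = 1024 ** 3
generated = 50 * GIB  # assumed: 50 GiB of redo generated per day

for copies in (1, 2, 3):
    written = redo_bytes_written(generated, copies)
    print(f"{copies} redo area(s): {written / GIB:.0f} GiB written per day")

# Assumed per-copy write latencies for a two-area setup, in milliseconds.
print(f"commit waits {commit_latency_ms([0.8, 2.5])} ms")
```

The write volume scales linearly with the number of copies, and under this model one slow redo area drags every commit down to its speed.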

Disk Is Smart Now

NetApp disk is really smart compared with the disk arrays of yore. A NetApp Filer has a layered set of subsystems that are designed to keep the data safe, and to have it perform really well. You can stripe your data over lots of spindles for performance, and there’s a nice big battery-protected memory cache before you even hit the disks. And there’s smart software for managing the way the data is read and written.

You can turn on snapshots so you can regularly back up even your redo logs without impacting performance.

So if we were writing to multiple redo logs (slowing the database down) to guard against stupid, unreliable disk, but disk is now smart and reliable, why keep doing it?


You Don’t Have It Anyway

Chances are, even if you, as the sysadmin/DBA, see multiple redo log areas presented from the NetApp Filer, they’re not really separate anyway. Most designs I’ve seen have a single redo log volume, contained within a single aggregate. There might be two or more qtrees on this volume presented to the host, but they still sit on the same bunch of disks.

You’re just slowing your database down by writing your logs twice, and paying extra for disk you don’t need.
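
You can get a first hint of this from the host side. The Python sketch below groups a list of directories by the device ID the operating system reports for them (os.stat and its st_dev field are standard; the function names and any paths you pass in are my own placeholders). Directories that share a device ID are on the same host-visible filesystem. The reverse doesn’t hold: different device IDs say nothing about whether the volumes share an aggregate, so you’d still need to check the layout on the Filer itself.

```python
import os
import sys
from collections import defaultdict


def group_by_device(paths):
    """Map each host-visible device ID (st_dev) to the paths that live on it."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.stat(path).st_dev].append(path)
    return dict(groups)


def shared_device_groups(paths):
    """Return the groups of paths that share one host-visible filesystem."""
    return [group for group in group_by_device(paths).values() if len(group) > 1]


if __name__ == "__main__":
    # Usage: pass your redo log directories as arguments (paths are yours to supply).
    for group in shared_device_groups(sys.argv[1:]):
        print("Same filesystem:", ", ".join(group))
```

If both of your redo log areas turn up in the same group, the second copy isn’t protecting you from anything the first copy wasn’t already exposed to.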

Move With The Times

Technology changes, and the way in which you do things needs to change accordingly.

Most of us don’t use dot-matrix printers any more. We use email instead of paper memos to talk to our colleagues. We have the internet.

And we use smart disk arrays instead of stupid disk.

What once made a lot of sense doesn’t make sense any more. It’s time to update your thinking to use the new technology.

And just like email, once you do, you’ll wonder how you ever got by without it.
