How REDO log works?


I was wondering if someone could explain why Oracle writes transactions to redo logs even though
the database is running in NOARCHIVELOG mode. If I am running in NOARCHIVELOG mode, doesn't it mean
that I don't need the logs which also should include redo logs? What good are these logs doing to me except for wasting my system resources? There must be a good reason for this. Can someone explain the logic behind this architecture.


Ans:
Separate these concepts:

-Redo log
The redo log consist of two or more log files, and contains all changes to
data blocks in the buffer cache not synced with their datafiles. In case of
a  database crash, the redo log is used to sync the datafiles with the
database state at the time of the crash. The redo log aids in recovering in
case of a disk failure. When a redo log file is full, a log switch occurs,
and the redo log is written to the next redo logfile. There are always two
or more redo logfiles.

-Archive log
Since the redo log is circular, the redo log files are overwritten
eventually. In archive log mode, a trail of redo log files is kept. An
archive is created of a redo log file before the redo log overwrites it. The
archive log aids in instance failures, being able to roll forward the
database state from the last backup to a specific (check) point in time.
Only backups (while the database is fully operational) are also possible.

In short, you need your redo logs files in all cases, because oracle needs
it to guarantee database consistency. The use of archive log mode is
optional. Benefits are recovering to point in time, and possibility for
online backups.  When you only require offline backups (shut-backup-start),
and the tape from your last offline backup is good enough in case of a
failure, you do not need archive log mode. Note that the archive log grows
unless you do something about it, like cleaning after backup.



Ans2:
Agreed that  redo logs aid recovery, but surely they are not essential. If you
accept that in the event of a failure, you will restore to a cold backup, why do
you need the redo logs. If you don't back them up and you loose everything you
can still go back to a cold back even without the redo logs. I have thought
about putting redo and rollback segments into a RAM disk to prevent the disk IO
in certain situations (e.g. Development environment). I've already tried it with
Rollbacks in RAM which seem to work OK. If it all goes nasty you just restore
and re-run. No roll forward necessary.

It would be nice if you could turn either or both off and save on the overhead.
Hence three modes. This would be particularly useful for large overnight batch
jobs where you do a cold backup at the end anyway. Because the job works 99.999%
of the time you wish to take advantage of the fact by switching off redo and
rollback and save on loads of IO.

Just my thoughts. Anyone else care to comment. Are these features available in
any other DBMS?

I still think Oracle writes twice as much data to the redo logs than it needs to
because rollback info gets logged too. But that's another story.



Ans3(Reply):

>Agreed that  redo logs aid recovery, but surely they are not essential. If you
>accept that in the event of a failure, you will restore to a cold backup, why do

The redo logs are used to recover from things like a POWER failure, software
failure, database crash, etc.  They are only "essential" if you don't want to
always have to go back to your last full cold backup after a power failure,
system failure, software failure of some sort (eg: they are pretty essential).

We log changed bytes to a large sequential file called the online redo logs.  We
buffer hundreds or thousands of changed blocks that will need to be randomly
written all over the place.  It makes the database go faster.  Thats the concept
of having redo logs -- speed and safety.

Lets say you change 100 blocks.  Block1 is in file1, block2 is in file2, and so
on (extreme case -- you've inserted 100 rows into 100 tables and changed 100
blocks on different files all over the place).  Instead of doing 100 seeks and
random writes to 'commit' your work and make it permanent -- we take the changed
bytes from your 100 updates and in one large write we write them out.  Much much
faster then 100 seeks and 100 separate writes.  If the database fails (power
goes out) right now, we have safely in the redo logs the changes you made even
though the datafiles *do not*.  When the database restarts, it will read the
online redo logs -- REDOING any work that was done before the crash.  The
database will then rollback any work that was redone but not committing and will
'commit' anything that was committed.  You lose no work.

>you need the redo logs. If you don't back them up and you loose everything you
>can still go back to a cold back even without the redo logs. I have thought
>about putting redo and rollback segments into a RAM disk to prevent the disk IO
>in certain situations (e.g. Development environment). I've already tried it with
>Rollbacks in RAM which seem to work OK. If it all goes nasty you just restore
>and re-run. No roll forward necessary.
>

don't, don't, don't -- unless you want to rebuild the whole database as you
describe.  Wouldn't recommend it.  all it takes is one shutdown abort or pulled
plug to kill your database and unless its a trivially sized database, it could
take a while to find the tapes and restore.

>It would be nice if you could turn either or both off and save on the overhead.
>Hence three modes. This would be particularly useful for large overnight batch
>jobs where you do a cold backup at the end anyway. Because the job works 99.999%
>of the time you wish to take advantage of the fact by switching off redo and
>rollback and save on loads of IO.
>

but you wouldn't be saving on loads of IO really -- if the database crashed in
the middle of a large load you would have a *real* mess on your hands.  Some of
the data would be loaded and on disk, some would be loaded but lost (since it
was in the buffer cache) and some would be not loaded at all -- you would have
*no* way of finding out.

the only time this would be useful is if you are loading an empty table or
appending lots of data.  In that case -- put the database in noarchivelog mode
and direct path load the data -- no log, no rollback just write to disk. So,
that mode does exist (but after putting the database into noarchivelog mode and
then back into archive log mode, you'll have to backup again so it may actually
not be worth it in many cases)...

>Just my thoughts. Anyone else care to comment. Are these features available in
>any other DBMS?
>
>I still think Oracle writes twice as much data to the redo logs than it needs to
>because rollback info gets logged too. But that's another story.
>

its gotta.  We roll forward from the log, back from rollback.  rollback is
logged.  Its a performance enhancement believe it or not (separate redo and
rollback).  Some databases put both ROLLBACK and REDO into their logs.  This has
a couple of implications:

- logs are both read from and written to.  They will rollback from the logs
(read) and write to the logs.  Oracle only WRITES to the online redo logs when
the database is up (well, arch will read them but only when lgwr isn't writing
them).  There is no read/write contention on logs -- no thrashing.

- logs in databases that store redo and rollback are typically managed like a
HEAP (whereas in Oracle they are managed like a stack).  Heap space management
has a high overhead when compared to a stack.  They have to use heap management
for support of long transactions (anything over a second or two) else they would
almost immediately run out of log space.  They need to micro manage the space in
the transaction logs.  Oracle managemes it like a stack.  We 'roll' logs as we
go along.  Space management is much more efficient this way.  Additionally we
have the concept of many rollback segments -- this further reduces the
contention problem for a shared resource and allows the logs to 'roll' without
getting full.  You can distribute the rollback io over many many disks and many
many users -- logs are a serial beast and reducing contention on them is a good
thing.



Ans4(Reply):
I agree with all your points about switching off redo and rollback, but still think
there are some occasions when it might be useful. I agree you must know what you're
doing and accept you will have to restore to a cold backup, but it would be nice to
have the choice. Although perhaps we already have too many choices so I'm not
compalining.

I also agree with not using a single area for roll back and redo because, as you say,
it does have advantages.

However, are you sure that you need to log rollback. I say this again because
consider the following example:

Table A column A is updated:

XXXX  to    YYYY
XXXX gets written to the rollback.
YYYY gets written to the redo.
and XXXX gets written to the redo.  (This is the bit I think is not needed)

Lets now restore the database and roll forward as we would expect.

table contains
XXXX

YYYY is read from the redo and applied to the datafile.
XXXX is read from the redo and is applied to the rollback.

Lets say this is 2 reads and 2 writes

However consider the alternative were XXXX was not written to redo log.

table contains

XXXX

YYYY is read from the redo log. Before it is applied the previous value on the
datafile is read XXXX.
XXXX is applied to the rollback.
YYYY is then applied to the datafile.

This too is two reads and two writes.

Hence we get to the same position but without rollback being written to the redo.

You could also make more optimizations like buffering up rollback so if you get to
the commit in the redo log you can throw away the rollback updates as they are not
needed. They are only needed if you stop rolling forward at a point in time before
the commit and then need to rollback.

I can so no real technical reason why this wouldn't work however, I may have missed
something really obvious in which case please say, anyone.

I know the change would not be trivial but it may save a good percentage of the redo.
Would this not be beneficial.

When I have run a test to update a 2K columns it does seem to write 4K to the redo
log. However, it is not easy to confirm this one way or the other.



Ans5(Reply):
>I agree with all your points about switching off redo and rollback, but still think
>there are some occasions when it might be useful. I agree you must know what you're
>doing and accept you will have to restore to a cold backup, but it would be nice to
>have the choice. Although perhaps we already have too many choices so I'm not
>compalining.
>
>I also agree with not using a single area for roll back and redo because, as you say,
>it does have advantages.
>
>However, are you sure that you need to log rollback. I say this again because
>consider the following example:
>
>Table A column A is updated:
>
>XXXX  to    YYYY
>XXXX gets written to the rollback.

xxxx is written to rollback segments which are buffered in the buffer cache like
all other segments (good thing they are -- consistent reads that use the
rollback segment need this).  So XXXX is not on disk generally its in the buffer
cache.

>YYYY gets written to the redo.

yyyy gets written to redo on the commit.

>and XXXX gets written to the redo.  (This is the bit I think is not needed)

this XXXX does in the same IO typically with the YYYY but the XXXX written to
rollback is still buffered (see digression in previous post about why you want
to write to redo on a commit but not do scattered IO all over the place to
datafiles)
 

>
>Lets now restore the database and roll forward as we would expect.
>
>table contains
>XXXX
>
>YYYY is read from the redo and applied to the datafile.
>XXXX is read from the redo and is applied to the rollback.
>
>Lets say this is 2 reads and 2 writes
>
>However consider the alternative were XXXX was not written to redo log.
>
>table contains
>
>XXXX
>
>YYYY is read from the redo log. Before it is applied the previous value on the
>datafile is read XXXX.
>XXXX is applied to the rollback.
>YYYY is then applied to the datafile.
>
>This too is two reads and two writes.
>

but your simple example falls apart in almost every case. The reason is
checkpointing and log switches.  Lets say you have 2-1 meg log files (2meg total
log).  You are updating 3meg of data.  updating XXXX to YYYY is the FIRST row
you updated.  It was written to redo log #1.  partway through the transaction we
switched from log #1 to log #2 -- this fires a checkpoint (so that blocks that
were modified and recorded in redo log#1 will be flushed to disk).  We now start
filling redo #2 with redo.  Bam -- system crashes.

We goto recover now.  Lets say we did not record rollback to redo as you
suggest.  Unfortunately for us -- the checkpoint that had been in progress when
we switched logfiles had a chance to flush to disk our first update from XXXX to
YYYY.  So, in your scenario:

- yyyy is read from the redo log.  before it is applied the 'previous' value on
the datafile is read YYYY   <<<<<------------- Herein lies the problem -- xxxx
no long exists in the datafile.  XXXX no longer exists anywhere in the system in
fact.

-- yyyy is applied to the rollback (since you are doing rollback from datafiles)
 

You cannot rollback anymore-- your database is broken, this transaction cannot
be rolled back anymore, nor can it be rolled forward.  You *need* to have
rollback in redo.  Rollback is buffered and datafiles are fuzzy -- you cannot
rely on the datafiles to have the before image....  Otherwise we would have to
stop all activity during a checkpoint and get a consistent picture of the
datafiles (and never have them be 'fuzzy' which they almost *always* are).

>Hence we get to the same position but without rollback being written to the redo.
>

no we couldn't... checkpoints, wrapping logfiles (eg: you have 2meg of log and
do 3meg of updates -- some of the redo disappears from the online logs)

>You could also make more optimizations like buffering up rollback so if you get to
>the commit in the redo log you can throw away the rollback updates as they are not
>needed. They are only needed if you stop rolling forward at a point in time before
>the commit and then need to rollback.
>

but they are needed for consistent reads -- ORA-1555 would happen in many many
more cases if we did that.

don't forget rollback isn't just for rolling back -- its an integral part of our
consistent read mechanism -- people read rollback (prevents LOCKs from blocking
reads so readers are not blocked by writers -- try to do a select count(*) from
T in another database when there is an outstanding UPDATE on the table -- one
thats not committed or rolledback -- it'll block).

>I can so no real technical reason why this wouldn't work however, I may have missed
>something really obvious in which case please say, anyone.
>
>I know the change would not be trivial but it may save a good percentage of the redo.
>Would this not be beneficial.
>
>When I have run a test to update a 2K columns it does seem to write 4K to the redo
>log. However, it is not easy to confirm this one way or the other.
>

redo of the data (2k) + redo of the rollback (2k) = 4k done in one large
sequential write as opposed to randon seeks and writes.  Multiply your single
session by dozens or hundreds and the payoff is there big time.



Ans6:
In a development environment the redo logs should not be slowing things
down.  Typically, a development environment is not as transaction intensive
as a production one (there are exceptions of course).
I would like to see a way to mark a table as not recoverable or not logged.
There are tables that we have that hold temporary information and if the
instance went down the information is not used.

I think it is a design choice and I cannot think of another commercial RDBMS
that allows you to basically make the database unrecoverable.  Can you
imagine competitor X advertising that company Y 's database is
unrecoverable.  Also they would have to make some major architecture changes
and that also probably drives it.
Jim



Ans7:
If you pull the plug on the machine or the OS kernal crashes, then Oracle is
able to retore any in doubt transactions to the point of failure.  Without
the redo logs, any crash or "shutdown abort" would require restoring to the
last cold backup.  The archive logs provide for a media failure recovery.  A
lost (delted) database file can be reovered with the archive logging.  The
redo logs provide for recovery from "smaller" failures.

Please note that by putting the redo logs in RAM disk, a power flicker or OS
crash would put you into media failure mode (a database file would be lost).
Recovery becomes more complex.

James



Ans8:
Actually, in a way, you do have a choice, but Oracle makes you select this
option for every table and index you may want to create.  If a table is not so
important that you wish to write to the redo logs, you can create it with the
NOLOGGING option, which will bypass writing to the redo logs.


Ans9:
Mike,

So you always restore your last cold backup on a power failure? I attempt to
start the database, see it roll back uncommitted transactions, an it is up
and running. Remember that uncommitted transactions are written to the
datafiles directly. Your RAM-disk redo-logs are gone. While you are reading
tapes, I am going home and enjoy my weekend.

I agree that backing up the redo logs is useless. Using them is useful.

About running batch jobs...when you change data on robust multi-user
databases, expect IO's. Use your OS concurrency on disk access paths and
spread your datafiles among multiple physical disks. Use multiple dbwriters
(1 per used physical disk).  Agree that IO's are bad news on any system, not
that they can and be avoided
 

>I still think Oracle writes twice as much data to the redo logs than it
>needs to
>because rollback info gets logged too. But that's another story.

What do you mean by 'rollback info gets logged too' ?
Consider:
User a modifies but doesn't  commit
User b reads data before User a changed it.

You try to implement this without storing the before state of a transaction
in a multi-user database.

JM



Ans10:
As far as performance is concerned I was thinking about large inserts which need
to be done via a program. Inserting 100MB of data to the datafiles means 100MB
being written to the rollback and 200MB being written to the redo log. In
development on a single disk this must slow things down a bit. Especially if the
100MB to the data files is almost sequential. I agree that there are many other
cases when the overhead is minimal.

I was thinking that if the facility existed it would be an advanced option. i.e.
You use it if you know what you are doing (and perhaps a bit mad) like many
other options. I have no doubts that using logs in no archive log mode is
defiantly the best default choice, but like most developers we like to have the
choice:-)  Don't worry, it's not on my top 100 list.

As for the advertising, well if they said that, it would be the users choice. It
would not be Oracles fault.

I'm surprised it would be a major architectural change. I thought you could just
miss out the writes to redo and rollback. The rest, i.e., checking, must already
by present. So in short it would be:

if flag_set
  write redo
end

I am being slightly flippant with the above code. I'm sure any change to Oracle
database is a significant change. I know there is a lot more to it.
 
 
 
 

Hosted by www.Geocities.ws

1