-Redo log
The redo log consist of two or more log files, and contains all changes
to
data blocks in the buffer cache not synced with their datafiles. In
case of
a database crash, the redo log is used to sync the datafiles
with the
database state at the time of the crash. The redo log aids in recovering
in
case of a disk failure. When a redo log file is full, a log switch
occurs,
and the redo log is written to the next redo logfile. There are always
two
or more redo logfiles.
-Archive log
Since the redo log is circular, the redo log files are overwritten
eventually. In archive log mode, a trail of redo log files is kept.
An
archive is created of a redo log file before the redo log overwrites
it. The
archive log aids in instance failures, being able to roll forward the
database state from the last backup to a specific (check) point in
time.
Only backups (while the database is fully operational) are also possible.
In short, you need your redo logs files in all cases, because oracle
needs
it to guarantee database consistency. The use of archive log mode is
optional. Benefits are recovering to point in time, and possibility
for
online backups. When you only require offline backups (shut-backup-start),
and the tape from your last offline backup is good enough in case of
a
failure, you do not need archive log mode. Note that the archive log
grows
unless you do something about it, like cleaning after backup.
It would be nice if you could turn either or both off and save on the
overhead.
Hence three modes. This would be particularly useful for large overnight
batch
jobs where you do a cold backup at the end anyway. Because the job
works 99.999%
of the time you wish to take advantage of the fact by switching off
redo and
rollback and save on loads of IO.
Just my thoughts. Anyone else care to comment. Are these features available
in
any other DBMS?
I still think Oracle writes twice as much data to the redo logs than
it needs to
because rollback info gets logged too. But that's another story.
>Agreed that redo logs aid recovery, but surely they are not essential.
If you
>accept that in the event of a failure, you will restore to a cold
backup, why do
The redo logs are used to recover from things like a POWER failure,
software
failure, database crash, etc. They are only "essential" if you
don't want to
always have to go back to your last full cold backup after a power
failure,
system failure, software failure of some sort (eg: they are pretty
essential).
We log changed bytes to a large sequential file called the online redo
logs. We
buffer hundreds or thousands of changed blocks that will need to be
randomly
written all over the place. It makes the database go faster.
Thats the concept
of having redo logs -- speed and safety.
Lets say you change 100 blocks. Block1 is in file1, block2 is
in file2, and so
on (extreme case -- you've inserted 100 rows into 100 tables and changed
100
blocks on different files all over the place). Instead of doing
100 seeks and
random writes to 'commit' your work and make it permanent -- we take
the changed
bytes from your 100 updates and in one large write we write them out.
Much much
faster then 100 seeks and 100 separate writes. If the database
fails (power
goes out) right now, we have safely in the redo logs the changes you
made even
though the datafiles *do not*. When the database restarts, it
will read the
online redo logs -- REDOING any work that was done before the crash.
The
database will then rollback any work that was redone but not committing
and will
'commit' anything that was committed. You lose no work.
>you need the redo logs. If you don't back them up and you loose everything
you
>can still go back to a cold back even without the redo logs. I have
thought
>about putting redo and rollback segments into a RAM disk to prevent
the disk IO
>in certain situations (e.g. Development environment). I've already
tried it with
>Rollbacks in RAM which seem to work OK. If it all goes nasty you just
restore
>and re-run. No roll forward necessary.
>
don't, don't, don't -- unless you want to rebuild the whole database
as you
describe. Wouldn't recommend it. all it takes is one shutdown
abort or pulled
plug to kill your database and unless its a trivially sized database,
it could
take a while to find the tapes and restore.
>It would be nice if you could turn either or both off and save on the
overhead.
>Hence three modes. This would be particularly useful for large overnight
batch
>jobs where you do a cold backup at the end anyway. Because the job
works 99.999%
>of the time you wish to take advantage of the fact by switching off
redo and
>rollback and save on loads of IO.
>
but you wouldn't be saving on loads of IO really -- if the database
crashed in
the middle of a large load you would have a *real* mess on your hands.
Some of
the data would be loaded and on disk, some would be loaded but lost
(since it
was in the buffer cache) and some would be not loaded at all -- you
would have
*no* way of finding out.
the only time this would be useful is if you are loading an empty table
or
appending lots of data. In that case -- put the database in noarchivelog
mode
and direct path load the data -- no log, no rollback just write to
disk. So,
that mode does exist (but after putting the database into noarchivelog
mode and
then back into archive log mode, you'll have to backup again so it
may actually
not be worth it in many cases)...
>Just my thoughts. Anyone else care to comment. Are these features available
in
>any other DBMS?
>
>I still think Oracle writes twice as much data to the redo logs than
it needs to
>because rollback info gets logged too. But that's another story.
>
its gotta. We roll forward from the log, back from rollback.
rollback is
logged. Its a performance enhancement believe it or not (separate
redo and
rollback). Some databases put both ROLLBACK and REDO into their
logs. This has
a couple of implications:
- logs are both read from and written to. They will rollback from
the logs
(read) and write to the logs. Oracle only WRITES to the online
redo logs when
the database is up (well, arch will read them but only when lgwr isn't
writing
them). There is no read/write contention on logs -- no thrashing.
- logs in databases that store redo and rollback are typically managed
like a
HEAP (whereas in Oracle they are managed like a stack). Heap
space management
has a high overhead when compared to a stack. They have to use
heap management
for support of long transactions (anything over a second or two) else
they would
almost immediately run out of log space. They need to micro manage
the space in
the transaction logs. Oracle managemes it like a stack.
We 'roll' logs as we
go along. Space management is much more efficient this way.
Additionally we
have the concept of many rollback segments -- this further reduces
the
contention problem for a shared resource and allows the logs to 'roll'
without
getting full. You can distribute the rollback io over many many
disks and many
many users -- logs are a serial beast and reducing contention on them
is a good
thing.
I also agree with not using a single area for roll back and redo because,
as you say,
it does have advantages.
However, are you sure that you need to log rollback. I say this again
because
consider the following example:
Table A column A is updated:
XXXX to YYYY
XXXX gets written to the rollback.
YYYY gets written to the redo.
and XXXX gets written to the redo. (This is the bit I think is
not needed)
Lets now restore the database and roll forward as we would expect.
table contains
XXXX
YYYY is read from the redo and applied to the datafile.
XXXX is read from the redo and is applied to the rollback.
Lets say this is 2 reads and 2 writes
However consider the alternative were XXXX was not written to redo log.
table contains
XXXX
YYYY is read from the redo log. Before it is applied the previous value
on the
datafile is read XXXX.
XXXX is applied to the rollback.
YYYY is then applied to the datafile.
This too is two reads and two writes.
Hence we get to the same position but without rollback being written to the redo.
You could also make more optimizations like buffering up rollback so
if you get to
the commit in the redo log you can throw away the rollback updates
as they are not
needed. They are only needed if you stop rolling forward at a point
in time before
the commit and then need to rollback.
I can so no real technical reason why this wouldn't work however, I
may have missed
something really obvious in which case please say, anyone.
I know the change would not be trivial but it may save a good percentage
of the redo.
Would this not be beneficial.
When I have run a test to update a 2K columns it does seem to write
4K to the redo
log. However, it is not easy to confirm this one way or the other.
xxxx is written to rollback segments which are buffered in the buffer
cache like
all other segments (good thing they are -- consistent reads that use
the
rollback segment need this). So XXXX is not on disk generally
its in the buffer
cache.
>YYYY gets written to the redo.
yyyy gets written to redo on the commit.
>and XXXX gets written to the redo. (This is the bit I think is not needed)
this XXXX does in the same IO typically with the YYYY but the XXXX written
to
rollback is still buffered (see digression in previous post about why
you want
to write to redo on a commit but not do scattered IO all over the place
to
datafiles)
>
>Lets now restore the database and roll forward as we would expect.
>
>table contains
>XXXX
>
>YYYY is read from the redo and applied to the datafile.
>XXXX is read from the redo and is applied to the rollback.
>
>Lets say this is 2 reads and 2 writes
>
>However consider the alternative were XXXX was not written to redo
log.
>
>table contains
>
>XXXX
>
>YYYY is read from the redo log. Before it is applied the previous
value on the
>datafile is read XXXX.
>XXXX is applied to the rollback.
>YYYY is then applied to the datafile.
>
>This too is two reads and two writes.
>
but your simple example falls apart in almost every case. The reason
is
checkpointing and log switches. Lets say you have 2-1 meg log
files (2meg total
log). You are updating 3meg of data. updating XXXX to YYYY
is the FIRST row
you updated. It was written to redo log #1. partway through
the transaction we
switched from log #1 to log #2 -- this fires a checkpoint (so that
blocks that
were modified and recorded in redo log#1 will be flushed to disk).
We now start
filling redo #2 with redo. Bam -- system crashes.
We goto recover now. Lets say we did not record rollback to redo
as you
suggest. Unfortunately for us -- the checkpoint that had been
in progress when
we switched logfiles had a chance to flush to disk our first update
from XXXX to
YYYY. So, in your scenario:
- yyyy is read from the redo log. before it is applied the 'previous'
value on
the datafile is read YYYY <<<<<-------------
Herein lies the problem -- xxxx
no long exists in the datafile. XXXX no longer exists anywhere
in the system in
fact.
-- yyyy is applied to the rollback (since you are doing rollback from
datafiles)
You cannot rollback anymore-- your database is broken, this transaction
cannot
be rolled back anymore, nor can it be rolled forward. You *need*
to have
rollback in redo. Rollback is buffered and datafiles are fuzzy
-- you cannot
rely on the datafiles to have the before image.... Otherwise
we would have to
stop all activity during a checkpoint and get a consistent picture
of the
datafiles (and never have them be 'fuzzy' which they almost *always*
are).
>Hence we get to the same position but without rollback being written
to the redo.
>
no we couldn't... checkpoints, wrapping logfiles (eg: you have 2meg
of log and
do 3meg of updates -- some of the redo disappears from the online logs)
>You could also make more optimizations like buffering up rollback so
if you get to
>the commit in the redo log you can throw away the rollback updates
as they are not
>needed. They are only needed if you stop rolling forward at a point
in time before
>the commit and then need to rollback.
>
but they are needed for consistent reads -- ORA-1555 would happen in
many many
more cases if we did that.
don't forget rollback isn't just for rolling back -- its an integral
part of our
consistent read mechanism -- people read rollback (prevents LOCKs from
blocking
reads so readers are not blocked by writers -- try to do a select count(*)
from
T in another database when there is an outstanding UPDATE on the table
-- one
thats not committed or rolledback -- it'll block).
>I can so no real technical reason why this wouldn't work however, I
may have missed
>something really obvious in which case please say, anyone.
>
>I know the change would not be trivial but it may save a good percentage
of the redo.
>Would this not be beneficial.
>
>When I have run a test to update a 2K columns it does seem to write
4K to the redo
>log. However, it is not easy to confirm this one way or the other.
>
redo of the data (2k) + redo of the rollback (2k) = 4k done in one large
sequential write as opposed to randon seeks and writes. Multiply
your single
session by dozens or hundreds and the payoff is there big time.
I think it is a design choice and I cannot think of another commercial
RDBMS
that allows you to basically make the database unrecoverable.
Can you
imagine competitor X advertising that company Y 's database is
unrecoverable. Also they would have to make some major architecture
changes
and that also probably drives it.
Jim
Please note that by putting the redo logs in RAM disk, a power flicker
or OS
crash would put you into media failure mode (a database file would
be lost).
Recovery becomes more complex.
James
So you always restore your last cold backup on a power failure? I attempt
to
start the database, see it roll back uncommitted transactions, an it
is up
and running. Remember that uncommitted transactions are written to
the
datafiles directly. Your RAM-disk redo-logs are gone. While you are
reading
tapes, I am going home and enjoy my weekend.
I agree that backing up the redo logs is useless. Using them is useful.
About running batch jobs...when you change data on robust multi-user
databases, expect IO's. Use your OS concurrency on disk access paths
and
spread your datafiles among multiple physical disks. Use multiple dbwriters
(1 per used physical disk). Agree that IO's are bad news on any
system, not
that they can and be avoided
>I still think Oracle writes twice as much data to the redo logs than
it
>needs to
>because rollback info gets logged too. But that's another story.
What do you mean by 'rollback info gets logged too' ?
Consider:
User a modifies but doesn't commit
User b reads data before User a changed it.
You try to implement this without storing the before state of a transaction
in a multi-user database.
JM
I was thinking that if the facility existed it would be an advanced
option. i.e.
You use it if you know what you are doing (and perhaps a bit mad) like
many
other options. I have no doubts that using logs in no archive log mode
is
defiantly the best default choice, but like most developers we like
to have the
choice:-) Don't worry, it's not on my top 100 list.
As for the advertising, well if they said that, it would be the users
choice. It
would not be Oracles fault.
I'm surprised it would be a major architectural change. I thought you
could just
miss out the writes to redo and rollback. The rest, i.e., checking,
must already
by present. So in short it would be:
if flag_set
write redo
end
I am being slightly flippant with the above code. I'm sure any change
to Oracle
database is a significant change. I know there is a lot more to it.