Why it is so slow when loading?


We are having a performance issue loading our fact table.  We are only able to
process 10 records per second when processing the PL/SQL code that transforms
the fact table.

Simple facts

HP-UX 11
Oracle 8.?
2 CPUs
1G RAM
6 mirrored pairs disk
fact file = 120Meg = 400,000 rows

We load this data into a stage table first, and then use PL/SQL packages to
validate business rules and load the fact table.

We have only been able to achieve 10 records per second loading the fact
table.  The size of the file mentioned above is not relevant.  A file of 150K
rows loads just as slow.  We do not have much data in the fact table yet
either.  We just started loading.

Our PL/SQL is fairly straightforward, but does about 9 lookups into other
tables for validation. We are thinking about making the smaller table
lookups into array lookups.  Will this help?  Most of the lookups are on very
small tables (< 20 rows).  Two of the lookups are on larger tables (100K -
200K rows). Are we doing too many lookups per record?

Also, the cursor on the stage table is implicit.  Should this be explicit for
perf reasons?

We are also issuing a commit after each row.  Should we change that to every
100, or every 1000 rows?

All of the things I can think about are code related since I am not a DBA.  We
can tweak the code, but I am looking for an order of magnitude of improvement
that may not be possible tweaking code.

Does anyone have any other recommendations that may help, or any benchamrks to
compare with?  I think that 10 records a second is poor, even if we perform 9
lookups.

Other DB things we have tried
We moved data files around disk to spread i/o issue.
We have not yet used table partitioning.
We are dropping indexes on fact table.
A DBA has performed general DB tuning.



Ans1:
>We are having a performance issue loading our fact table.  We are only able to
>process 10 records per second when processing the PL/SQL code that transforms
>the fact table.
>
>Simple facts
>
>HP-UX 11
>Oracle 8.?
>2 CPUs
>1G RAM
>6 mirrored pairs disk
>fact file = 120Meg = 400,000 rows
>
>We load this data into a stage table first, and then use PL/SQL packages to
>validate business rules and load the fact table.
>
>We have only been able to achieve 10 records per second loading the fact
>table.  The size of the file mentioned above is not relevant.  A file of 150K
>rows loads just as slow.  We do not have much data in the fact table yet
>either.  We just started loading.
>
>Our PL/SQL is fairly straightforward, but does about 9 lookups into other
>tables for validation. We are thinking about making the smaller table
>lookups into array lookups.  Will this help?  Most of the lookups are on very
>small tables (< 20 rows).  Two of the lookups are on larger tables (100K -
>200K rows). Are we doing too many lookups per record?
>

can you use declaritive integrity instead of lookups?  declaritive check
constraints + foreign keys will be much faster then procedural code any day.

can you 'parallelize' the plsql processing.  can you kick off X scripts, each of
which might be able to process 10 rows/second?

have you set timed statistics to true and enabled sql_trace and used tkprof to
analyze the results (see server tuning manual for info on the above)....
 
 

>Also, the cursor on the stage table is implicit.  Should this be explicit for
>perf reasons?
>
>We are also issuing a commit after each row.  Should we change that to every
>100, or every 1000 rows?
>

make it more then 1.  go for 1000 or more.

>All of the things I can think about are code related since I am not a DBA.  We
>can tweak the code, but I am looking for an order of magnitude of improvement
>that may not be possible tweaking code.
>

you might just be able to run 10 at the same time and get the speed up.  You'll
have to "partition" your data somehow in the stage table by key or something to
give each process something to work with.



Ans2:
It is almost always a bad idea to use PL/SQL for this
type of processing.  However, 10 records per second is
very bad.

If you check the execution plans for the 9 lookups you
will probably find that a number of them are doing tablescans,
giving you perhaps several hundred comparisons per second.

Yes, one commit per row will also have an impact.
No, the difference between implicit and explicit will be insignificant

Further, if you are using a single PL/SQL cursor on the
stage table you are either single-threading, in which case
you are wasting a whole CPU, or you are multi-streaming
and without careful planning you could be getting signficant
conflicts on the inserts - particularly if your fact table is
pre-indexed.

Suggested Strategies (in probable order of cost/benefit)
1)    Make sure you have good access paths on the lookup
2)    Commit every 100 rows
3)    Try to partition the input data and run multiple streams
            but increase initrans on the fact table/indexes
            and set freelists = number of threads
4)    Look at options for:
        create intermediate table unrecoverably in parallel as
        select join between stage table and lookup tables.
        and test for lookup failures by the presence of null columns

Note:  A good strategy - if you are using a PL/SQL loop - is to
update the stage table so that you can restart the loop in case
of failure.
 

--

Jonathan Lewis
Yet another Oracle-related web site: www.jlcomp.demon.co.uk
 

Hosted by www.Geocities.ws

1