Question:
Hi, I have this query that takes forever.
SELECT R.REQUEST_TYPE_CODE,
RH.STAGE_CODE,
RH.USER_FULL_NAME,
to_char(COUNT(RH.USER_FULL_NAME))
FROM table1 R,
table2 RH
WHERE R.REQUEST_ID = RH.REQUEST_ID
AND RH.GROUP_CODE = 'RETAIL'
AND RH.ACTION_CODE IN ('One', 'Two ','Three', 'Four')
AND RH.STAGE_CODE IN ('Yo','Bro', 'Dude', 'hello', 'whatever')
AND RH.USER_STATE = 'Hell'
AND RH.REQST_HIST_DATE >= to_date('13/12/1999', 'DD/MM/YYYY')
AND RH.REQST_HIST_DATE < to_date('14/01/2000', 'DD/MM/YYYY')
GROUP BY R.REQUEST_TYPE_CODE,
RH.STAGE_CODE,
RH.USER_FULL_NAME
ORDER BY R.REQUEST_TYPE_CODE,
RH.STAGE_CODE DESC,
RH.USER_FULL_NAME
If I take out the group and order by clause it speeds it up to less
than 10 secs, otherwise it takes 10 mins +.
there is and index on reqst_hist_date on table2. table 2 contains
around 800,000 rows as with table 1.
Anyone with ideas on how this can be sped up. Any help would be
appreciated.
Answer 1:
I can see why it is taking long.
The actual query isn't taking long, its the SORTING its doing!
I remember this, i had a similar report which I ended up
rewriting using PL/SQL and temp tables. Flew life dung off a
shovel afterwards!
You are first doing a group by which will do an implicit sort in
the background.
You then have an order by clause which will take that sorted
data and resort it again just to get your data in stage_code
descending order after the requst_type_code.
Dunno what you can do about it though apart from rewriting your
code to avoid one of the sorts.
(it would have been nice if Oracle supported group by XX desc,
you wouldn't have a problem then)
Answer 2:
First, why are you using TO_CHAR on the count results? I can see
no
reason for this conversion.
My suggestion is to run EXPLAIN PLAN on this query and find out just
what the optimizer is doing to return your results. You will
first
need to run $ORACLE_HOME/rdbms/admin/utlxplan.sql to create PLAN_TABLE
for the results. Then take your query and perform an explain
plan on
it:
explain plan
set statement_id = 'retail' for
SELECT R.REQUEST_TYPE_CODE,
RH.STAGE_CODE,
RH.USER_FULL_NAME,
to_char(COUNT(RH.USER_FULL_NAME))
FROM table1 R,
table2 RH
WHERE R.REQUEST_ID = RH.REQUEST_ID
AND RH.GROUP_CODE = 'RETAIL'
AND RH.ACTION_CODE IN ('One', 'Two ','Three', 'Four')
AND RH.STAGE_CODE IN ('Yo','Bro', 'Dude', 'hello', 'whatever')
AND RH.USER_STATE = 'Hell'
AND RH.REQST_HIST_DATE >= to_date('13/12/1999', 'DD/MM/YYYY')
AND RH.REQST_HIST_DATE < to_date('14/01/2000', 'DD/MM/YYYY')
GROUP BY R.REQUEST_TYPE_CODE,
RH.STAGE_CODE,
RH.USER_FULL_NAME
ORDER BY R.REQUEST_TYPE_CODE,
RH.STAGE_CODE DESC,
RH.USER_FULL_NAME;
You should see the following output:
Explained.
Use the following query to return and format the output from the
EXPLAIN PLAN action:
select LPAD(' ',2*LEVEL)||Operation||' '||Options||' '||Object_name
Q_Plan
from PLAN_TABLE
where statement_id = 'retail'
connect by prior ID = Parent_ID and statement_id = 'retail'
start with ID=0;
This should provide an accounting of the optimizer execution path for
this query. Having that you should be able to determine what
is
slowing down the query; query tuning can then commence by altering
the
query and executing another EXPLAIN PLAN action on it (be sure to
change the statement_id value, or delete from PLAN_TABLE where the
statement_id is 'retail' so you don't get a jumble of results from
the
PLAN_TABLE query).
One possibility to explore is to install and enable the parallel query
option for this database. If the option is not installed (you
will
know by the message reported by SQL*Plus as you connect to the
database) you or the DBA will need to install it. Some init.ora
parameters must be set to enable parallel query; the DBA will know
what
to set and how to configure the options. After that, enabling
parallel
query is easy:
alter table <tablename> parallel;
Do this for each table in your query. Run the EXPLAIN PLAN query
listed above again, after parallel query is enabled and see if the
execution path has been altered (you should see a different plan).
That should decrease the time to return results, since there will be
multiple query servers extracting the data from the database.
The data
from the multiple queries will be concatenated into the final result.
I realize that this is quite a bit to digest, but I believe that this
will help in the long run.