Presents your 
SQL SERVER E-NEWSLETTER for October 22, 2002

<------------------------------------------->

Achieve better performance in LOOKUP tables

Lookup tables that have a relatively small set of values and are static
should potentially use the alternate key vs. the surrogate key as the
primary key to achieve better performance. Potential benefits are determined
by analyzing access paths or data usage. The surrogate key of a table 
is the value being generated by the DBMS system. For SQL Server, that may
be the IDENTITY property of the primary key. The alternate key may be 
the CHAR(6) code column of the table. The alternate key is constrained by a
unique constraint.

Access path or data usage analysis is analyzing the queries' access to
the database's data. If 80 percent of the queries are joining from one 
or several large table(s) to the lookup table(s) and have the alternate 
key as the filtering mechanism in the WHERE clause, then the lookup table 
is a potential candidate.

Another aspect of the access paths or usage analysis to the data is the
frequency of these queries. If the same 80 percent of the queries are
being run at a high rate, at least significantly higher than the 
remaining 20 percent of the queries, then the lookup table is a probable 
candidate for using the alternate key.

If the queries have only the alternate key as the filtering mechanism 
in the WHERE clause, you can increase performance by:

1. Moving the alternate key (the code column with data type CHAR(6)) to
the primary key and eliminating the surrogate key.

2. Creating foreign keys from the lookup table into the large tables
using the code column with data type CHAR(6).

3. Adding an index on the foreign key.

You don't need to join from the large table to the small lookup table
via the surrogate key to get to the code column. You already have the 
code column as the foreign key in the large table. As a result, you 
eliminate the join, which is a performance gain, and the filtering clause 
on the code column that is indexed should result in an index seek.

However, if the large tables only have a few thousand rows, you may not
see the performance gain. It may be that the performance gain will not 
be realized until the large table has several million rows. Following is a
script for a simple illustration of this scenario.

-- should insert 1000 rows for testing
CREATE TABLE LookUp
(lupkey1 int IDENTITY
    CONSTRAINT pk_LookUp PRIMARY KEY,
LuCode CHAR(6) NOT NULL
    CONSTRAINT ak_LookUp UNIQUE,
LuCodeDescr VARCHAR(80) NOT NULL
    CONSTRAINT dflt_LuCodeDescr DEFAULT 'UNKNOWN')
GO

-- should insert several million rows for  testing
CREATE TABLE BigTable
(pkey1 INT IDENTITY
    CONSTRAINT pk_BigTable PRIMARY KEY,
lupkey1 int NOT NULL
    CONSTRAINT fk_LookUpBigTable FOREIGN KEY
    REFERENCES LookUp (lupkey1),
Col1 int NULL)
GO

SELECT bt.*
FROM BigTable bt
JOIN LookUp lu on bt.lupkey1 = lu.lupkey1
WHERE lu.LuCode = 'TEST1'
GO
DROP TABLE BigTable, LookUp
GO

CREATE TABLE LookUp
(LuCode CHAR(6) NOT NULL
    CONSTRAINT pk_LookUp PRIMARY KEY,
LuCodeDescr VARCHAR(80) NOT NULL
    CONSTRAINT dflt_LuCodeDescr DEFAULT 'UNKNOWN')
GO

CREATE TABLE BigTable
(pkey1 INT IDENTITY
    CONSTRAINT pk_BigTable PRIMARY KEY,
LuCode CHAR(6) NOT NULL
    CONSTRAINT fk_LookUpBigTable FOREIGN KEY
    REFERENCES LookUp (LuCode),
Col1 int NULL)
GO
CREATE NONCLUSTERED INDEX idx_fkBigTableLuCode ON BigTAble (LuCode)
GO

SELECT bt.*
FROM BigTable bt
WHERE bt.LuCode = 'TEST1'
GO
DROP TABLE BigTable, LookUp
GO

----------------------------------------