How to normalize tables?

Question:

I have to create a table for the following:
- 100 000 products
- each product has 10 descriptions
- each description must be in 30 languages

If I create one table to contain all of the properties of the products,
I'll have at least PRODUCT_ID + 30*10 = 301 columns.

An other choice can be to use two tables:
- table PRODUCTS with PRODUCT_ID and other columns
- table DESCRIPTIONS with the columns
      - PRODUCT_ID
      - DESCRIPTION_ID
      - DESCRIPTION_LANGUAGE
      - DESCRIPTION

The table DESCRIPTIONS will contain 100 000*10*30 = 30 000 000 rows

Is there a best way ?

Thanks for any help.

Answer 1:

I would probably break it up one more:
[products table]
products_id
product description

[language table]
language_id
language description

[description table]
product_id
language_id
description_id
description_text

Don't sweat the 30 million rows.  You're not the first person to have a big
table.  In fact, if your descriptions are short, you might have indexes
larger than you description table.

--rn.

Answer 2:

another choice:

products ( product_id PRIMARY KEY, and other columns )

descriptions ( product_id REFERENCES products, description_language,
description_1, description_2, ...., description_10, primary key( product_id,
description_language ) )
 

Since you'll always access descriptions via the primary key index on product_id,
description_language (eg: your query is always:

select * from products, description
where products.product_id = description.product_id
and description.language_id = :the_language_of_choice;

) it won't matter much if there are 3million, 30million or more records in the
second table.

I like this approach better as it allows you to easily add and drop languages --
no structural changes needed.

Your first 2 table design is nice if you need to add/remove some description
fields over time but is a little harder to query.

Answer 4:

Personally - I'd choose option B because if next year, your company started
selling products to Martians and Martian became a language to support on
your database, then you'd have to do an ALTER TABLE just to add it. Option
B allows you to add a gazillion languages to the database without a DBA
having to be involved with an application data entry issue (disk space
however, could possibly be an issue ...).

However - option B adds about 30-40 Meg more data to represent the extra
product ID/ language ID (assuming about 10bytes to represent the product
id) . But what the heck ... space is cheap ... and your database will be
normalized better.

- Troy
 

Hosted by www.Geocities.ws

1