dbTalk Databases Forums  

Denormalization

comp.databases comp.databases


Discuss Denormalization in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
AT
 
Posts: n/a

Default Denormalization - 06-30-2006 , 03:58 PM






This week I came across the same construct the second time, and I am
curious whether it is good practice or not.

A huge table was factored according to the values in one column, and
the value was used as a part of the new table name.

Example:

Table INFO has columns C1, C2, C3.

The values in column C1 are "a", "b", "c",..."z"

Create tables INFOa, INFOb, INFOc, .. INFOz with columns C2, C3, of the
same type as in table INFO. Table INFOa has the same rows as table INFO
where the C1 column has value "a", except the C1 column itself is
dropped. And so on for tables INFOb, INFOc,...INFOz.

SELECT and UPDATE statements where then constructed on the fly with
string concatenation

"SELECT C2, C3 from INFO${param} where C2 > 5"

for example.

Table INFO is not actually present in the database.

Now I don't know what to call this, maybe out-of-band-denormalization.

The two databases where mysql based. Size 400MB for all the INFO[a-z]
tables in one case, 20MB in the other case.

So I can see this might lead to performance gains. Is this a common
optimization approach? Is it performed with Oracle / PostgreSQL
databases ? I would have expected this can be handled better in some
other way; maybe with views.

Thanks

Stephan


Reply With Quote
  #2  
Old   
Bill Karwin
 
Posts: n/a

Default Re: Denormalization - 06-30-2006 , 04:16 PM






stephanwehner (AT) gmail (DOT) com wrote:
Quote:
So I can see this might lead to performance gains. Is this a common
optimization approach? Is it performed with Oracle / PostgreSQL
databases ? I would have expected this can be handled better in some
other way; maybe with views.
You might want to read about partitioning, a new feature coming in MySQL
5.1 (still in beta). Some other RDBMS implementations have similar
features, but since you're using MySQL, this might be of interest to you.

http://dev.mysql.com/doc/refman/5.1/...-overview.html

It could achieve the benefits you are seeking, by splitting the physical
storage without splitting the logical table entity.

Regards,
Bill K.


Reply With Quote
  #3  
Old   
AT
 
Posts: n/a

Default Re: Denormalization - 06-30-2006 , 07:20 PM



Hi Bill,
Thanks for the link.

I take it then that is not out-of-the-ordinary to split a table in this
manner.
I was trying to understand how this is dealt with in other database
engines as well. Views don't allow this kind of optimization?

Stephan

Bill Karwin wrote:
Quote:
stephanwehner (AT) gmail (DOT) com wrote:
So I can see this might lead to performance gains. Is this a common
optimization approach? Is it performed with Oracle / PostgreSQL
databases ? I would have expected this can be handled better in some
other way; maybe with views.

You might want to read about partitioning, a new feature coming in MySQL
5.1 (still in beta). Some other RDBMS implementations have similar
features, but since you're using MySQL, this might be of interest to you.

http://dev.mysql.com/doc/refman/5.1/...-overview.html

It could achieve the benefits you are seeking, by splitting the physical
storage without splitting the logical table entity.

Regards,
Bill K.


Reply With Quote
  #4  
Old   
David Portas
 
Posts: n/a

Default Re: Denormalization - 07-01-2006 , 02:46 AM



stephanwehner (AT) gmail (DOT) com wrote:
Quote:
Hi Bill,
Thanks for the link.

I take it then that is not out-of-the-ordinary to split a table in this
manner.
I was trying to understand how this is dealt with in other database
engines as well. Views don't allow this kind of optimization?

Stephan

It's a technique common to DBMSs that don't have better optimization
features. Most enterprise DBMSs do allow physical partitioning without
creating new tables. In those systems that don't you may be forced to
create new tables ("logical partitioning") as a workaround. I don't
know MySQL so I don't know if it has better ways to do this.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--



Reply With Quote
  #5  
Old   
Bill Karwin
 
Posts: n/a

Default Re: Denormalization - 07-01-2006 , 03:18 AM



stephanwehner (AT) gmail (DOT) com wrote:
Quote:
I take it then that is not out-of-the-ordinary to split a table in this
manner.
No -- I don't think it's generally recommended to manually split tables
like you describe.

MySQL should be able to handle hundreds of thousands, or millions of
rows on a properly tuned server. Read the chapter of the MySQL manual
on Optimization, it has many tips on how to tune things. But nowhere
does it say, "split up your table".

Another feature that you might read up on is the MERGE storage engine.
Though this is usually used to enable very large tables that exceed
physical limits of file storage. It's not really intended to be a
performance enhancement.

Quote:
I was trying to understand how this is dealt with in other database
engines as well. Views don't allow this kind of optimization?
No, views rely on the storage of their underlying tables.

Regards,
Bill K.


Reply With Quote
  #6  
Old   
AT
 
Posts: n/a

Default Re: Denormalization - 07-02-2006 , 01:07 AM



Ok, thanks all for the help!

Stephan


Reply With Quote
  #7  
Old   
David Cressey
 
Posts: n/a

Default Re: Denormalization - 07-02-2006 , 12:23 PM




<stephanwehner (AT) gmail (DOT) com> wrote

Quote:
This week I came across the same construct the second time, and I am
curious whether it is good practice or not.

A huge table was factored according to the values in one column, and
the value was used as a part of the new table name.

Example:

Table INFO has columns C1, C2, C3.

The values in column C1 are "a", "b", "c",..."z"

Create tables INFOa, INFOb, INFOc, .. INFOz with columns C2, C3, of the
same type as in table INFO. Table INFOa has the same rows as table INFO
where the C1 column has value "a", except the C1 column itself is
dropped. And so on for tables INFOb, INFOc,...INFOz.

SELECT and UPDATE statements where then constructed on the fly with
string concatenation

"SELECT C2, C3 from INFO${param} where C2 > 5"

for example.

Table INFO is not actually present in the database.

Now I don't know what to call this, maybe out-of-band-denormalization.

The two databases where mysql based. Size 400MB for all the INFO[a-z]
tables in one case, 20MB in the other case.

So I can see this might lead to performance gains. Is this a common
optimization approach? Is it performed with Oracle / PostgreSQL
databases ? I would have expected this can be handled better in some
other way; maybe with views.

Thanks

Stephan

The "goodness" or "badness" of any given design depends on the design goals
specific to the project.

Having said this, the above design will cause a host of problems and bring
almost no benefits in many situations.

If the goal is to PHYSICALLY separate the data from the different
constituent tables, rather than to LOGICALLY separate said data, then
there is a much more attractive alternative, called partitioning a table.
New rows can be stored in different partitions, depending on the value of
the first column in your example (so called "horizontal partitioning.")

With a really good DBMS, you'll get almost all the benefits of using
separate tables, with none of the drawbacks.

The biggest source of problems is the fact that column 1, which was data in
the original design, has now been represented as metadata in your proposed
design. This almost always causes problems.






Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.