dbTalk Databases Forums  

small m-to-n tables

comp.databases comp.databases


Discuss small m-to-n tables in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
metaperl
 
Posts: n/a

Default small m-to-n tables - 02-02-2009 , 10:58 AM






let's say you have a person table... and a person can have multiple
email_types (home, work) multiple phone_types (work, cell, alternate).
and of course any type can relate to multiple people.

Now, the most error prone and least useful way to model this is to
create enumerated type columns in the person table... it is error
prone because the actual email table will join with the people table
on this enumerated field and it has to be kept in sync in 2 places.
not only that, but it is more difficult to get a list of email_types
for front-end rendering this way.

The 'standard' way to do this is to:

(1) create an email_types table with 2 columns: id and the name of the
email_type
(2) create a person table with a column indicated preferred email
(3) create a person_email table which has all email values for that
person.

Now , my co-workers are up-in arms about table #1 above... they think
it is counter-productive to have a table with only 2 or 3 items in it.

The compromise has been a 'types_warehouse' table, which would lump
together the email_types and phone_types into a table like so:

CREATE TABLE types_warehose
INTEGER id
VARCHAR(255) typeof_type
VARCHAR(255) type_name


and this table would have entries like:

(1, 'email', 'home')
(2, 'email', 'work')
(3, 'phone', 'cell')
(4, 'phone', 'alternate')


Do you have any feedback on the 3 options for modelling m-to-n
relations? I am in favor of the 'standard' way. Is there a particular
online or book reference that would argue for/against any of these
approaches?


Reply With Quote
  #2  
Old   
Ed Prochak
 
Posts: n/a

Default Re: small m-to-n tables - 02-02-2009 , 11:31 AM






On Feb 2, 11:58*am, metaperl <metap... (AT) gmail (DOT) com> wrote:
Quote:
let's say you have a person table... and a person can have multiple
email_types (home, work) multiple phone_types (work, cell, alternate).
and of course any type can relate to multiple people.

Now, the most error prone and least useful way to model this is to
create enumerated type columns in the person table... it is error
prone because the actual email table will join with the people *table
on this enumerated field and it has to be kept in sync in 2 places.
not only that, but it is more difficult to get a list of email_types
for front-end rendering this way.

The 'standard' way to do this is to:

(1) create an email_types table with 2 columns: id and the name of the
email_type
(2) create a person table with a column indicated preferred email
(3) create a person_email table which has all email values for that
person.

Now , my co-workers are up-in arms about table #1 above... they think
it is counter-productive to have a table with only 2 or 3 items in it.

The compromise has been a 'types_warehouse' table, which would lump
together the email_types and phone_types into a table like so:

CREATE TABLE types_warehose
* INTEGER id
* VARCHAR(255) typeof_type
* VARCHAR(255) type_name

and this table would have entries like:

(1, 'email', 'home')
(2, 'email', 'work')
(3, 'phone', 'cell')
(4, 'phone', 'alternate')

Do you have any feedback on the 3 options for modelling m-to-n
relations? I am in favor of the 'standard' way. Is there a particular
online or book reference that would argue for/against any of these
approaches?
NNNOoooooo!

An entity is an entity no matter how small.

The merged table you describe can work, but it means long ramp up for
new employees maintaining the system. Why, because that table becomes
the kitchen odds and ends drawer, a place called OTHER.

Advantages of the many small specific tables:
Every table represents one thing, so easy to learn and understand.
Here you have two small tables: PHONE_TYPE and EMAIL_TYPE.
Easy to maintain. A simple INSERT can add a new type.
Referential Integrity

Advantages of one codes table:
Presumed easier to maintain via a single maintenance application.
Presumed easier on the DBMS to have fewer tables to maintain.

Disadvantages of many small tables:
If the DBMS has some minimum table size on the storage medium, there
may be a lot of wasted space
Each of those tables needs an index (at least for the PK), so similar
space argument there.
Lots of tables for new developers to learn.

Disadvantages of one codes table:
NO referential integrity.
many different types in the same table mean multiple joins to the same
table in many queries.
One PK index must be searched to deeper levels, so likely slower.
This path leads past the OTLT (One True Lookup Table) toward EAV
(Entity Attribute Value) designs. In most cases this ends in pain and
disaster.

I would stick to my guns opposing the one table model as long as
possible.
HTH,
ed


Reply With Quote
  #3  
Old   
CJ
 
Posts: n/a

Default Re: small m-to-n tables - 02-03-2009 , 03:28 AM



On Feb 2, 12:31*pm, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote:
Quote:
On Feb 2, 11:58*am, metaperl <metap... (AT) gmail (DOT) com> wrote:





let's say you have a person table... and a person can have multiple
email_types (home, work) multiple phone_types (work, cell, alternate).
and of course any type can relate to multiple people.

Now, the most error prone and least useful way to model this is to
create enumerated type columns in the person table... it is error
prone because the actual email table will join with the people *table
on this enumerated field and it has to be kept in sync in 2 places.
not only that, but it is more difficult to get a list of email_types
for front-end rendering this way.

The 'standard' way to do this is to:

(1) create an email_types table with 2 columns: id and the name of the
email_type
(2) create a person table with a column indicated preferred email
(3) create a person_email table which has all email values for that
person.

Now , my co-workers are up-in arms about table #1 above... they think
it is counter-productive to have a table with only 2 or 3 items in it.

The compromise has been a 'types_warehouse' table, which would lump
together the email_types and phone_types into a table like so:

CREATE TABLE types_warehose
* INTEGER id
* VARCHAR(255) typeof_type
* VARCHAR(255) type_name

and this table would have entries like:

(1, 'email', 'home')
(2, 'email', 'work')
(3, 'phone', 'cell')
(4, 'phone', 'alternate')

Do you have any feedback on the 3 options for modelling m-to-n
relations? I am in favor of the 'standard' way. Is there a particular
online or book reference that would argue for/against any of these
approaches?

NNNOoooooo!

An entity is an entity no matter how small.

The merged table you describe can work, but it means long ramp up for
new employees maintaining the system. Why, because that table becomes
the kitchen odds and ends drawer, a place called OTHER.

Advantages of the many small specific tables:
Every table represents one thing, so easy to learn and *understand.
Here you have two small tables: PHONE_TYPE and EMAIL_TYPE.
Easy to maintain. A simple INSERT can add a new type.
Referential Integrity

Advantages of one codes table:
Presumed easier to maintain via a single maintenance application.
Presumed easier on the DBMS to have fewer tables to maintain.

Disadvantages of many small tables:
If the DBMS has some minimum table size on the storage medium, there
may be a lot of wasted space
Each of those tables needs an index (at least for the PK), so similar
space argument there.
Lots of tables for new developers to learn.

Disadvantages of one codes table:
NO referential integrity.
many different types in the same table mean multiple joins to the same
table in many queries.
One PK index must be searched to deeper levels, so likely slower.
This path leads past the OTLT (One True Lookup Table) toward EAV
(Entity Attribute Value) designs. In most cases this ends in pain and
disaster.

I would stick to my guns opposing the one table model as long as
possible.
* *HTH,
* ed- Hide quoted text -

- Show quoted text -
I agree completely with Ed's evaluation. Your instincts on the design
were correct, metaperl. Hold your ground. A few months after
implementation, they'll see the light of day.

CJ


Reply With Quote
  #4  
Old   
Philipp Post
 
Posts: n/a

Default Re: small m-to-n tables - 02-03-2009 , 03:56 AM



Quote:
let's say you have a person table... and a person can have multiple email_types (home, work) multiple phone_types (work, cell, alternate). and of course any type can relate to multiple people.
You could also use a CHECK constraint on the type-column if the type
list is short and not likely to be changed frquently.

As others already said, I would also abstain from OTLT/MUCT tables.

brgds

Philipp Post


Reply With Quote
  #5  
Old   
Walter Mitty
 
Posts: n/a

Default Re: small m-to-n tables - 02-03-2009 , 08:59 AM




"metaperl" <metaperl (AT) gmail (DOT) com> wrote

Quote:
let's say you have a person table... and a person can have multiple
email_types (home, work) multiple phone_types (work, cell, alternate).
and of course any type can relate to multiple people.

Now, the most error prone and least useful way to model this is to
create enumerated type columns in the person table... it is error
prone because the actual email table will join with the people table
on this enumerated field and it has to be kept in sync in 2 places.
not only that, but it is more difficult to get a list of email_types
for front-end rendering this way.

The 'standard' way to do this is to:

(1) create an email_types table with 2 columns: id and the name of the
email_type
(2) create a person table with a column indicated preferred email
(3) create a person_email table which has all email values for that
person.

Now , my co-workers are up-in arms about table #1 above... they think
it is counter-productive to have a table with only 2 or 3 items in it.

Counterproductive in what way?

Some people think that a lot of tables in a schema results in complexity
that makes the schema difficult for a newcomer to understand. An answer is
that it's easier to understand 100 tables than it is to understand 100,000
lines of code.

Some people think that the extra joins are going to slow the database engine
down too much. If you have a good database engine with a good optimizer,
and you have adequate hardware, and you index right, join overhead is
easily within the realm of reasonable expectations. The answer is, if you
think it's causing too much overhead, is to try it out and compare it with
the overhead of the solution with fewer tables.

Some people think it's too complicated to come up with queries when tables
involve a bunch of joins. One answer to that is that everything should be
as simple as possible, but not simpler than that. A schema with too few
tables in order to accomodate query writers that don't understand m:n
relationships is trying to make thing too simple. If you really have
programmers and users who are that unsophisticated writing SQL or the like,
create views that denormalize on demand, and let them use the views.

There's more to your original post, but I wanted to get this out of the way
for starters.





Reply With Quote
  #6  
Old   
Marco Mariani
 
Posts: n/a

Default Re: small m-to-n tables - 02-03-2009 , 11:14 AM



Walter Mitty wrote:

Quote:
A schema with too few
tables in order to accomodate query writers that don't understand m:n
relationships is trying to make thing too simple. If you really have
programmers and users who are that unsophisticated writing SQL or the like,
create views that denormalize on demand, and let them use the views.
Another possible reason:
some people don't want/like/are able to write complex sql code and want
to brew their coffee with their ORM of choice. ORMs are often inflexible
in mapping relational concepts and when they are flexible, they require
experience to configure (as I have found myself with sqlalchemy)

In the end, you have schemas that mostly relate 1 java class with 1 sql
table, lots of surrogate keys, data integrity enforced at the
application level, and hic sunt leones.



Reply With Quote
  #7  
Old   
Walter Mitty
 
Posts: n/a

Default Re: small m-to-n tables - 02-04-2009 , 12:41 PM




"Marco Mariani" <marco (AT) sferacarta (DOT) com> wrote

Quote:
Walter Mitty wrote:

A schema with too few tables in order to accomodate query writers that
don't understand m:n relationships is trying to make thing too simple.
If you really have programmers and users who are that unsophisticated
writing SQL or the like, create views that denormalize on demand, and let
them use the views.

Another possible reason:
some people don't want/like/are able to write complex sql code and want to
brew their coffee with their ORM of choice. ORMs are often inflexible in
mapping relational concepts and when they are flexible, they require
experience to configure (as I have found myself with sqlalchemy)

In the end, you have schemas that mostly relate 1 java class with 1 sql
table, lots of surrogate keys, data integrity enforced at the application
level, and hic sunt leones.

Maybe it's just my bias, but I think of this as undoing all the progress
that was made in the area of databases in the 1970s and 1980s. Back to the
bad old days when only the application code knows what's going on.





Reply With Quote
  #8  
Old   
Marco Mariani
 
Posts: n/a

Default Re: small m-to-n tables - 02-05-2009 , 04:00 AM



Walter Mitty wrote:

Quote:
Maybe it's just my bias, but I think of this as undoing all the progress
that was made in the area of databases in the 1970s and 1980s. Back to the
bad old days when only the application code knows what's going on.
I don't know the reasons -- either youngsters trust more The Google
(copy and paste the recipe) than the advice of a mentor (read this book,
carefully)... or the SQL creators and tool vendors have their share of
responsibility in creating a system where it's quite hard to refactor
and re-think choices, when compared to the "application code".
There are web frameworks that, quite disturbingly, encourage the
developers to consider the data model as something low-level that he
should not be concerned with, not anymore, a thing of the past like
buffer overruns.
But it does not surprise me that software is progressing in cycles much
like civilization has done for.. ever.
(I hope we leave the dark age of the Javan Empire soon)


Reply With Quote
  #9  
Old   
Walter Mitty
 
Posts: n/a

Default Re: small m-to-n tables - 02-05-2009 , 01:42 PM




"Marco Mariani" <marco (AT) sferacarta (DOT) com> wrote

Quote:
Walter Mitty wrote:

Maybe it's just my bias, but I think of this as undoing all the progress
that was made in the area of databases in the 1970s and 1980s. Back to
the bad old days when only the application code knows what's going on.

I don't know the reasons -- either youngsters trust more The Google (copy
and paste the recipe) than the advice of a mentor (read this book,
carefully)... or the SQL creators and tool vendors have their share of
responsibility in creating a system where it's quite hard to refactor and
re-think choices, when compared to the "application code".
There are web frameworks that, quite disturbingly, encourage the
developers to consider the data model as something low-level that he
should not be concerned with, not anymore, a thing of the past like buffer
overruns.
But it does not surprise me that software is progressing in cycles much
like civilization has done for.. ever.
(I hope we leave the dark age of the Javan Empire soon)
Some of it is due to the Mario Andretti syndrome: if you're in control, then
you're not going fast enough. Some of it is due to the fact that the CS
professors of five or ten years ago were themselves mostly ignorant of the
powers and perils of databases. They just hadn't been through that part of
the information revolution, so they didn't build it into their curriculum.
Some of it is due to the phenomena you outlined: cut and paste in
preference to learn and build, and the effects of the SQL derby and the
tool derby.

But some of it is because the permanent base of shared data is inherently
harder to go back and rebuild than other parts of the system. Think of an
analogy with a PC. If the RAM craps out, or the power supply goes up in
smoke, you can go out an buy a part, swap it in, and reboot the same
computer. But if the only hard disk fails, and you go out and buy a
replacement, you've essentially created a new computer. Your ability to
reinstall all the things you had on the old one depends on backups, and on
installation CDs and the like.

Databases should be able to respond to changing requirements due to changing
realities. They should also be able to respond when errors and omissions of
the original data analysis come to the surface. But refactoring databases
is never going to be as easy as the rest of system engineering.





Reply With Quote
  #10  
Old   
Hung Vo
 
Posts: n/a

Default Re: small m-to-n tables - 02-05-2009 , 08:22 PM



On Feb 5, 9:00*pm, Marco Mariani <ma... (AT) sferacarta (DOT) com> wrote:
Quote:
There are web frameworks that, quite disturbingly, encourage the
developers to consider the data model as something low-level that he
should not be concerned with, not anymore, a thing of the past like
buffer overruns.
But it does not surprise me that software is progressing in cycles much
like civilization has done for.. ever.
(I hope we leave the dark age of the Javan Empire soon)
I'm not sure if that means something similar to DAO pattern. If that
the case then I do not agree with that. Its not "disturbing" at all to
structure your application into layers and let the data access layer
deal with the heterogony of undelying database systems.


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.