![]() | |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
let's say you have a person table... and a person can have multiple email_types (home, work) multiple phone_types (work, cell, alternate). and of course any type can relate to multiple people. Now, the most error prone and least useful way to model this is to create enumerated type columns in the person table... it is error prone because the actual email table will join with the people *table on this enumerated field and it has to be kept in sync in 2 places. not only that, but it is more difficult to get a list of email_types for front-end rendering this way. The 'standard' way to do this is to: (1) create an email_types table with 2 columns: id and the name of the email_type (2) create a person table with a column indicated preferred email (3) create a person_email table which has all email values for that person. Now , my co-workers are up-in arms about table #1 above... they think it is counter-productive to have a table with only 2 or 3 items in it. The compromise has been a 'types_warehouse' table, which would lump together the email_types and phone_types into a table like so: CREATE TABLE types_warehose * INTEGER id * VARCHAR(255) typeof_type * VARCHAR(255) type_name and this table would have entries like: (1, 'email', 'home') (2, 'email', 'work') (3, 'phone', 'cell') (4, 'phone', 'alternate') Do you have any feedback on the 3 options for modelling m-to-n relations? I am in favor of the 'standard' way. Is there a particular online or book reference that would argue for/against any of these approaches? |
#3
| |||
| |||
|
|
On Feb 2, 11:58*am, metaperl <metap... (AT) gmail (DOT) com> wrote: let's say you have a person table... and a person can have multiple email_types (home, work) multiple phone_types (work, cell, alternate). and of course any type can relate to multiple people. Now, the most error prone and least useful way to model this is to create enumerated type columns in the person table... it is error prone because the actual email table will join with the people *table on this enumerated field and it has to be kept in sync in 2 places. not only that, but it is more difficult to get a list of email_types for front-end rendering this way. The 'standard' way to do this is to: (1) create an email_types table with 2 columns: id and the name of the email_type (2) create a person table with a column indicated preferred email (3) create a person_email table which has all email values for that person. Now , my co-workers are up-in arms about table #1 above... they think it is counter-productive to have a table with only 2 or 3 items in it. The compromise has been a 'types_warehouse' table, which would lump together the email_types and phone_types into a table like so: CREATE TABLE types_warehose * INTEGER id * VARCHAR(255) typeof_type * VARCHAR(255) type_name and this table would have entries like: (1, 'email', 'home') (2, 'email', 'work') (3, 'phone', 'cell') (4, 'phone', 'alternate') Do you have any feedback on the 3 options for modelling m-to-n relations? I am in favor of the 'standard' way. Is there a particular online or book reference that would argue for/against any of these approaches? NNNOoooooo! An entity is an entity no matter how small. The merged table you describe can work, but it means long ramp up for new employees maintaining the system. Why, because that table becomes the kitchen odds and ends drawer, a place called OTHER. Advantages of the many small specific tables: Every table represents one thing, so easy to learn and *understand. Here you have two small tables: PHONE_TYPE and EMAIL_TYPE. Easy to maintain. A simple INSERT can add a new type. Referential Integrity Advantages of one codes table: Presumed easier to maintain via a single maintenance application. Presumed easier on the DBMS to have fewer tables to maintain. Disadvantages of many small tables: If the DBMS has some minimum table size on the storage medium, there may be a lot of wasted space Each of those tables needs an index (at least for the PK), so similar space argument there. Lots of tables for new developers to learn. Disadvantages of one codes table: NO referential integrity. many different types in the same table mean multiple joins to the same table in many queries. One PK index must be searched to deeper levels, so likely slower. This path leads past the OTLT (One True Lookup Table) toward EAV (Entity Attribute Value) designs. In most cases this ends in pain and disaster. I would stick to my guns opposing the one table model as long as possible. * *HTH, * ed- Hide quoted text - - Show quoted text - |
#4
| |||
| |||
|
|
let's say you have a person table... and a person can have multiple email_types (home, work) multiple phone_types (work, cell, alternate). and of course any type can relate to multiple people. |
#5
| |||
| |||
|
|
let's say you have a person table... and a person can have multiple email_types (home, work) multiple phone_types (work, cell, alternate). and of course any type can relate to multiple people. Now, the most error prone and least useful way to model this is to create enumerated type columns in the person table... it is error prone because the actual email table will join with the people table on this enumerated field and it has to be kept in sync in 2 places. not only that, but it is more difficult to get a list of email_types for front-end rendering this way. The 'standard' way to do this is to: (1) create an email_types table with 2 columns: id and the name of the email_type (2) create a person table with a column indicated preferred email (3) create a person_email table which has all email values for that person. Now , my co-workers are up-in arms about table #1 above... they think it is counter-productive to have a table with only 2 or 3 items in it. |
#6
| |||
| |||
|
|
A schema with too few tables in order to accomodate query writers that don't understand m:n relationships is trying to make thing too simple. If you really have programmers and users who are that unsophisticated writing SQL or the like, create views that denormalize on demand, and let them use the views. |
#7
| |||
| |||
|
|
Walter Mitty wrote: A schema with too few tables in order to accomodate query writers that don't understand m:n relationships is trying to make thing too simple. If you really have programmers and users who are that unsophisticated writing SQL or the like, create views that denormalize on demand, and let them use the views. Another possible reason: some people don't want/like/are able to write complex sql code and want to brew their coffee with their ORM of choice. ORMs are often inflexible in mapping relational concepts and when they are flexible, they require experience to configure (as I have found myself with sqlalchemy) In the end, you have schemas that mostly relate 1 java class with 1 sql table, lots of surrogate keys, data integrity enforced at the application level, and hic sunt leones. |
#8
| |||
| |||
|
|
Maybe it's just my bias, but I think of this as undoing all the progress that was made in the area of databases in the 1970s and 1980s. Back to the bad old days when only the application code knows what's going on. |
#9
| |||
| |||
|
|
Walter Mitty wrote: Maybe it's just my bias, but I think of this as undoing all the progress that was made in the area of databases in the 1970s and 1980s. Back to the bad old days when only the application code knows what's going on. I don't know the reasons -- either youngsters trust more The Google (copy and paste the recipe) than the advice of a mentor (read this book, carefully)... or the SQL creators and tool vendors have their share of responsibility in creating a system where it's quite hard to refactor and re-think choices, when compared to the "application code". There are web frameworks that, quite disturbingly, encourage the developers to consider the data model as something low-level that he should not be concerned with, not anymore, a thing of the past like buffer overruns. But it does not surprise me that software is progressing in cycles much like civilization has done for.. ever. (I hope we leave the dark age of the Javan Empire soon) |
#10
| |||
| |||
|
|
There are web frameworks that, quite disturbingly, encourage the developers to consider the data model as something low-level that he should not be concerned with, not anymore, a thing of the past like buffer overruns. But it does not surprise me that software is progressing in cycles much like civilization has done for.. ever. (I hope we leave the dark age of the Javan Empire soon) |
![]() |
| Thread Tools | |
| Display Modes | |
| |