dbTalk Databases Forums  

Database design: How to conquer complexity and information overload

comp.databases comp.databases


Discuss Database design: How to conquer complexity and information overload in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Casper Bang
 
Posts: n/a

Default Database design: How to conquer complexity and information overload - 01-25-2009 , 12:20 PM






At times I come into contact with very large database schema's and
within long start to get headaches from all the detail and complexity,
much of it irrelevant to my cause. This bothers me a great deal, as I
find it horrible to navigate and manage a database with 150 tables,
some of which with 50+ columns.

In most other branches of software engineering, we deal with
complexity by means of divide-and-conquer and raising the abstraction
bar through various means. However, it occurs to me this is rarely
done with physical database design. Database often seem to be be these
huge buckets with a myriad of unknown entry points from the layer
above, or at best, only separated into separate schema's for
authorization reasons.

Having smaller interconnected domain schema's, perhaps revolving
around a central core, but factor out (denormalize into 1:1) domain
specific information into several smaller tables would seem like a
solution.

Is physical database design somehow different, perhaps for performance
reasons? Or have I just been exposed to careless (some would say
pragmatic) design? There does not seem to be much best-practice
material on this topic.

Thanks,
Casper

Reply With Quote
  #2  
Old   
Walter Mitty
 
Posts: n/a

Default Re: Database design: How to conquer complexity and information overload - 01-26-2009 , 09:54 AM







"Casper Bang" <casper.bang (AT) gmail (DOT) com> wrote

Quote:
At times I come into contact with very large database schema's and
within long start to get headaches from all the detail and complexity,
much of it irrelevant to my cause. This bothers me a great deal, as I
find it horrible to navigate and manage a database with 150 tables,
some of which with 50+ columns.

In most other branches of software engineering, we deal with
complexity by means of divide-and-conquer and raising the abstraction
bar through various means. However, it occurs to me this is rarely
done with physical database design. Database often seem to be be these
huge buckets with a myriad of unknown entry points from the layer
above, or at best, only separated into separate schema's for
authorization reasons.

Having smaller interconnected domain schema's, perhaps revolving
around a central core, but factor out (denormalize into 1:1) domain
specific information into several smaller tables would seem like a
solution.

Is physical database design somehow different, perhaps for performance
reasons? Or have I just been exposed to careless (some would say
pragmatic) design? There does not seem to be much best-practice
material on this topic.

Thanks,
Casper
First off, most of your comment addresses logical database design, not
physical database design.

There are plenty of real world subject matters where 150 tables and a total
of say about 700 columns is entirely appropriate to capture the inherent
complexity of the data itself. Without more information, I can't tell you
whether you were looking at a good design of a complex subject matter, or
just a lousy design. There are plenty of both out there. A misinformed
preoccupation with performance is responsible for a lot of bad designs.

A simple and sound logical design that meets the information requirements
will generally provide good enough performance, if other things are right.
Physical design of tables generally mirrors the logical design. Good
physical design of such things as indexes, tablespaces, and table mapping
generally improve performance, but don't result in the complexity you
describe.

Check out the "Handbook of Relational Database Design".




Reply With Quote
  #3  
Old   
Casper Bang
 
Posts: n/a

Default Re: Database design: How to conquer complexity and informationoverload - 01-26-2009 , 10:29 AM



Thanks Walter,

On Jan 26, 4:54*pm, "Walter Mitty" <wami... (AT) verizon (DOT) net> wrote:
Quote:
First off, most of your comment addresses logical database design, not
physical database design.
Right, the two becomes interchangeable over time in large enterprise
systems, where one is considering implicit to the other in an ad-hoc
work flow. If I may rephrase then: Doesn't logical design sometimes
leak into typical physical design and void normalization etc. in the
interest of managability?

Quote:
There are plenty of real world subject matters where 150 tables and a total
of say about 700 columns is entirely appropriate to capture the inherent
complexity of the data itself. *
But since these 700 columns are rarely needed out of a single specific
context, it would seem beneficial to factor out via 1:1 relations into
separate meta-entities. I guess what I am looking for are guidelines
or techniques similar to the various refactorings we utilize in modern
programming, where i.e. many small methods are desirable over one
large, separation of concerns between objects etc. etc.

Though I find it hard to discuss without getting concrete, I'm
starting to believe it's simply "ok" to have very large schemas with
many columns, and that the complexity is meant to be dealth with only
in the tier above. The problem is, with JPA anyway, I don't see how
you avoid pulling in dependencies for each and every related table and
so you are exposed to everything anyway.

/Casper


Reply With Quote
  #4  
Old   
Walter Mitty
 
Posts: n/a

Default Re: Database design: How to conquer complexity and information overload - 01-26-2009 , 12:35 PM




"Casper Bang" <casper.bang (AT) gmail (DOT) com> wrote

Thanks Walter,

Quote:
On Jan 26, 4:54 pm, "Walter Mitty" <wami... (AT) verizon (DOT) net> wrote:
First off, most of your comment addresses logical database design, not
physical database design.

Right, the two becomes interchangeable over time in large enterprise
systems, where one is considering implicit to the other in an ad-hoc
work flow. If I may rephrase then: Doesn't logical design sometimes
leak into typical physical design and void normalization etc. in the
interest of managability?
The history of the use of the terms "logical design" and "physical design"
has taken a few unfortunate turns. (Alternatively you could say the same
thing about the terms "logical data model" and "physical data model"). Each
of the experts means something very, very precise by the logical/physical
distinction. Unfortunately, different experts use the terms in mutually
contradictory ways.

The way I tend to use it (not necessarily the right way) is that logical
features are visible to programmers, while physical features are transparent
to programmers, except for performance effects. Thus if you have a table
that's been decomposed into two separate tables, the programming that has
been written against the composite table has to be rewritten to deal with
the separate tables. Sometimes, the actual work of revising the app can be
minimized by creating a view that looks just like the old table, and maybe
even has the same name as the old table. In that case queries that
referenced the old table survive with no maintainance.

There are many times when one is given a legacy database and all one has is
the existing database metadata. You can use tools that reverse engineer the
existing metadata back into a bunch of "CREATE TABLE" statements, but that's
about it. What was going on in the mind of the designer is gone. This is
unfortunate. Undocumented data can be as intractable as undocumented code.
In a better world, logical design would be preserved in a document for later
maintainers to use for learning purposes. There are tools that maintain the
model in a better form than in a text document, but you get the idea.

Sometimes, logical design doesn't follow all the normalization rules,
particularly when I'm the designer. I'm not sure whether or not that's what
you meant by the physical design "leaking" into the logical design. I
wouldn't have used such phrasing.

One of the biggest banes in large enterprises is the vast amount of
undocumented data. When databases and DBMS software were dreamed up in the
1960s and 1970s, the biggest reason for using databases was overcoming the
problem of undocumented data. Decades later, faced with an army of database
architects who never grasped this simple fact, we're back in the soup
again. To be fair, some of the current architects understood the danger of
undocumented data, but weren't being paid to develop the documents.

Sorry about this rambling response.




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.