dbTalk Databases Forums  

PK for Fact Table?

comp.databases comp.databases


Discuss PK for Fact Table? in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
dataman
 
Posts: n/a

Default PK for Fact Table? - 07-30-2006 , 04:43 PM






I've been struggling with why I need one of these. All of
Ralph's writings say the PK is the a composite of the
dimension FK's. Well OK, but why define one at all? All my
queries are sums or counts. Any thoughts?

Reply With Quote
  #2  
Old   
Bill Karwin
 
Posts: n/a

Default Re: PK for Fact Table? - 07-31-2006 , 12:43 AM






dataman wrote:
Quote:
I've been struggling with why I need one of these. All of
Ralph's writings say the PK is the a composite of the
dimension FK's. Well OK, but why define one at all? All my
queries are sums or counts. Any thoughts?
It's difficult for me to visualize the table structure you are talking
about, based on that description. There is not enough context to know
what you mean by several of the terms you use (such as "fact table").

But in general, you do need a PK in a table. Otherwise, you may end up
with duplicate identical rows, and also there will be no way of
addressing individual rows.

A primary key is a way to identify each row uniquely. This can be a
"natural key" which is some column (or columns) which is unique over the
table. Or it can be a "pseudokey" which is typically a monotonically
increasing integer column. In some tables, foreign keys to other tables
form a sensible choice for a composite primary key.

Regards,
Bill K.


Reply With Quote
  #3  
Old   
David Portas
 
Posts: n/a

Default Re: PK for Fact Table? - 07-31-2006 , 01:26 AM



dataman wrote:
Quote:
I've been struggling with why I need one of these. All of
Ralph's writings say the PK is the a composite of the
dimension FK's. Well OK, but why define one at all? All my
queries are sums or counts. Any thoughts?

No surprise that you struggle with Kimball. In so far as they are
explained at all, his ideas on logical modelling are not very sound.

Fact tables are not different to other tables so the rationale for keys
is not different either. Study some proper relational basics before you
go near RK.

--
David Portas



Reply With Quote
  #4  
Old   
dataman
 
Posts: n/a

Default Re: PK for Fact Table? - 07-31-2006 , 10:22 AM



Quote:
dataman wrote:
I've been struggling with why I need one of these. All
of Ralph's writings say the PK is the a composite of the
dimension FK's. Well OK, but why define one at all?
All my queries are sums or counts. Any thoughts?

It's difficult for me to visualize the table structure you
are talking about, based on that description. There is
not enough context to know what you mean by several of
the terms you use (such as "fact table").
But in general, you do need a PK in a table. Otherwise,
you may end up with duplicate identical rows, and also
there will be no way of addressing individual rows.

A primary key is a way to identify each row uniquely.
This can be a "natural key" which is some column (or
columns) which is unique over the table. Or it can be a
"pseudokey" which is typically a monotonically increasing
integer column. In some tables, foreign keys to other
tables form a sensible choice for a composite primary
key.
Regards,
Bill K.
Fact table refers to Ralph Kimball's dimensional modeling
for analytics/data mining reporting. I wholeheartedly agree
the a PK is requisite for OLTP modeling. It's 1NF. But in
dimensional modeling I don't think I need one. I never do
updates, only inserts managed through ETL load processes.
Thanks for the replies.


Reply With Quote
  #5  
Old   
Bill Karwin
 
Posts: n/a

Default Re: PK for Fact Table? - 07-31-2006 , 12:39 PM



dataman wrote:
Quote:
Fact table refers to Ralph Kimball's dimensional modeling
for analytics/data mining reporting. I wholeheartedly agree
the a PK is requisite for OLTP modeling. It's 1NF. But in
dimensional modeling I don't think I need one. I never do
updates, only inserts managed through ETL load processes.
Thanks for the replies.
Aha. Thanks, I had no idea whom you meant when you said simply "Ralph".
:-) I usually assume that questions are in regard to traditional OLTP
modeling, unless the question is very clear that it's about some other
paradigm.

Ok, I've found RK's dimensional modeling manifesto. Interesting reading!

It seems to me that if you are doing sums and counts on the fact table,
that you'd need a PK constraint to ensure that each row is a unique
instance of some fact. Otherwise you don't know if a fact has been
erroneously recorded more than once. If this is the case, your sums and
counts will be inaccurate.

If there is a case where there is a legitimate need to store the fact
twice, then having a PK in place on the fact table identifies these
cases, which should indicate that you need another dimension table to
distinguish the instances of these otherwise identical facts. If you
are prohibited from inserting a row to the fact table because of the PK
constraint, it means your dimensional model is incomplete for that fact
table.

I can't speak to the nuances of Ralph Kimball's theories, but regardless
of the difference between ER modeling and DM modeling, it should be a
requirement that any single row in any table should be logically
identifiable. The difference is whether that is done by an attribute or
by a dimension.

Can you give a specific example of a case where this doesn't hold true?

Regards,
Bill K.


Reply With Quote
  #6  
Old   
David Cressey
 
Posts: n/a

Default Re: PK for Fact Table? - 08-02-2006 , 06:04 AM




"dataman" <dataman (AT) mail (DOT) ev1.net> wrote

Quote:
dataman wrote:
I've been struggling with why I need one of these. All
of Ralph's writings say the PK is the a composite of the
dimension FK's. Well OK, but why define one at all?
All my queries are sums or counts. Any thoughts?

It's difficult for me to visualize the table structure you
are talking about, based on that description. There is
not enough context to know what you mean by several of
the terms you use (such as "fact table").
But in general, you do need a PK in a table. Otherwise,
you may end up with duplicate identical rows, and also
there will be no way of addressing individual rows.

A primary key is a way to identify each row uniquely.
This can be a "natural key" which is some column (or
columns) which is unique over the table. Or it can be a
"pseudokey" which is typically a monotonically increasing
integer column. In some tables, foreign keys to other
tables form a sensible choice for a composite primary
key.
Regards,
Bill K.
Fact table refers to Ralph Kimball's dimensional modeling
for analytics/data mining reporting. I wholeheartedly agree
the a PK is requisite for OLTP modeling. It's 1NF. But in
dimensional modeling I don't think I need one. I never do
updates, only inserts managed through ETL load processes.
Thanks for the replies.
Dataman,

There are really two questions here, as I read it.

The first is whether any candidate key exists. The second is whether it's
necessary to declare a PK.

The combination of all the FKs in a fact table is clearly a candidate key,
whether it's declared or not. The fact table isn't necessarily in 2NF, so
it's possible that some subset of the combination of all the FK's might be a
candidate key as well.


If you declare the PK in the fact table, the RDBMS will detect duplicate
fact table entries for you. This could be useful when loading the table,
but not when consulting it.

Some DB servers declare an index for you when you declare a PK, and
sometimes a so called "star join" can proceed faster with such an index
declared. That's what I've read. I can't say for sure.

I have always declared a PK on my fact tables, on the theory that I may need
it someday, and I'm willing to incur the overhead of it.

YMMV.





Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.