dbTalk Databases Forums  

Single fact table, without dimension tables

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Single fact table, without dimension tables in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Pluto
 
Posts: n/a

Default Single fact table, without dimension tables - 05-05-2004 , 09:59 AM






Hi,

I have to incorporate in a cube data coming from different sources. This
data is a flat text file, the result of a an sql query with multiple joins.
"Parent" data is being repeated over and over, for each "Child" data.

I intend to store this file in a single fact table, and use the same fact
table to create the different dimensions. A short test has shown me it
works.

I understand that this way of storing data is not optimal: an foreign key is
certainly smaller than anything else. But this way, I can avoid recreating
an sql structure, by splitting data into a pure fact table, and dimension
tables, with keys and joins inbetween.

In this context, does it make sense to use one single fact table, and no
dimension table, in order to build an olap cube? Is the resulting OLAP cube
supposed to be correct, even if similar dimensions rows are being repeated
several times?

Thanks

Philippe Lang



Reply With Quote
  #2  
Old   
Learner
 
Posts: n/a

Default Re: Single fact table, without dimension tables - 05-05-2004 , 01:34 PM






Hi Philippe,

Welcome to the jungle.

A few weeks back (26-Mar-04 to be precise) I started out with a pretty
similar question and am sharing the following based on the replies and
suggestions that I got from so many helpful folks here. Before I go on,
I would suggest that DO try to go through the various posts and these
are EXCELLENT learning ground.

I also started out using a single table as the fact as well basing the
dimensions on it...u know what...I was getting results too but never
checked them because I learned that my approach was wrong.

1) Go through posts with the subject "ETL"... you need to do some
transformations which will extract the tables for your underlying
dimensions. I'm told that the best approach is to base your fact and
dimensions on views rather than tables. I am doing this but to be honest
am not clear about the advantage... Well what do you expect from a
learner

2) Make sure that you use data types which occupy small 'space'... treat
this as a watch out factor

3) Make surrogate keys... Again something that I have learned from here
and these is supposed to be the 'right approach'

4) In summary do not use a single table for your fact as well as
dimensions. You might be getting the results (and maybe even right
results) but you will run into problems soon. What problems? This is
where the gurus will come in...

I hope I have ben able to reply to at least some of your questions. I
really hope so because this is my FIRST reply where I feel that I am
sharing whatever I have gained from here and am trying to help out
another "Learner"

Sorry for the lengthy post.

Good Luck.
P.S.
Log in the hours and the efforts and you'll be excited with where you
stand in just a few weeks.

Oh yes,,,,, and then you will also be running into a front-end OLAP tool
selection. This part I will not comment on YET and am in the evaluation
process myself..have come across some neat tools



Quote:
|Hi,
|
|I have to incorporate in a cube data coming from different sources. This
|data is a flat text file, the result of a an sql query with multiple joins.
|"Parent" data is being repeated over and over, for each "Child" data.
|
|I intend to store this file in a single fact table, and use the same fact
|table to create the different dimensions. A short test has shown me it
|works.
|
|I understand that this way of storing data is not optimal: an foreign key is
|certainly smaller than anything else. But this way, I can avoid recreating
|an sql structure, by splitting data into a pure fact table, and dimension
|tables, with keys and joins inbetween.
|
|In this context, does it make sense to use one single fact table, and no
|dimension table, in order to build an olap cube? Is the resulting OLAP cube
|supposed to be correct, even if similar dimensions rows are being repeated
|several times?
|
|Thanks
|
|Philippe Lang
|
|
|
--

--
Thanks.


Reply With Quote
  #3  
Old   
Pluto
 
Posts: n/a

Default Re: Single fact table, without dimension tables - 05-06-2004 , 04:47 AM



Hello,

Thanks for your answer. I have been searching for the thread you are talking
about, but I couldn't find it, sorry.

I have also made some searches regarding ETL techniques, and yes, a fact
table and separate dimension tables seems to be the way to go.
Unfortunately, with a flat textfile, based on an sql query with joins, it
requires a lot of efforts when loading data into the database.

You are talking about "views". What about doing things like this:

1) One single "fact + dimensions" table
2) One fact view of this table: a SELECT of the "fact" columns, the highest
granularity of the query
3) Several dimension views of this table: a SELECT DISTINCT ROW of the
"dimension" columns

It's certainly a wrong approach, but why exactly? What kind of problems
could I run into?

Thanks


"Learner" <wantnospam (AT) email (DOT) com> a écrit dans le message de
news:MPG.1b038a678d1a3049896ea (AT) msnews (DOT) microsoft.com...
Quote:
Hi Philippe,

Welcome to the jungle.

A few weeks back (26-Mar-04 to be precise) I started out with a pretty
similar question and am sharing the following based on the replies and
suggestions that I got from so many helpful folks here. Before I go on,
I would suggest that DO try to go through the various posts and these
are EXCELLENT learning ground.

I also started out using a single table as the fact as well basing the
dimensions on it...u know what...I was getting results too but never
checked them because I learned that my approach was wrong.

1) Go through posts with the subject "ETL"... you need to do some
transformations which will extract the tables for your underlying
dimensions. I'm told that the best approach is to base your fact and
dimensions on views rather than tables. I am doing this but to be honest
am not clear about the advantage... Well what do you expect from a
learner

2) Make sure that you use data types which occupy small 'space'... treat
this as a watch out factor

3) Make surrogate keys... Again something that I have learned from here
and these is supposed to be the 'right approach'

4) In summary do not use a single table for your fact as well as
dimensions. You might be getting the results (and maybe even right
results) but you will run into problems soon. What problems? This is
where the gurus will come in...

I hope I have ben able to reply to at least some of your questions. I
really hope so because this is my FIRST reply where I feel that I am
sharing whatever I have gained from here and am trying to help out
another "Learner"

Sorry for the lengthy post.

Good Luck.
P.S.
Log in the hours and the efforts and you'll be excited with where you
stand in just a few weeks.

Oh yes,,,,, and then you will also be running into a front-end OLAP tool
selection. This part I will not comment on YET and am in the evaluation
process myself..have come across some neat tools



|Hi,
|
|I have to incorporate in a cube data coming from different sources.
This
|data is a flat text file, the result of a an sql query with multiple
joins.
|"Parent" data is being repeated over and over, for each "Child" data.
|
|I intend to store this file in a single fact table, and use the same
fact
|table to create the different dimensions. A short test has shown me it
|works.
|
|I understand that this way of storing data is not optimal: an foreign
key is
|certainly smaller than anything else. But this way, I can avoid
recreating
|an sql structure, by splitting data into a pure fact table, and
dimension
|tables, with keys and joins inbetween.
|
|In this context, does it make sense to use one single fact table, and
no
|dimension table, in order to build an olap cube? Is the resulting OLAP
cube
|supposed to be correct, even if similar dimensions rows are being
repeated
|several times?
|
|Thanks
|
|Philippe Lang
|
|
|

--

--
Thanks.



Reply With Quote
  #4  
Old   
Learner
 
Posts: n/a

Default Re: Single fact table, without dimension tables - 05-06-2004 , 08:49 PM



Hi,

I don't want to misguide you since I'm a starter myself. Please allow me
a little time to get back to you with useful replies to your actual
questions. Till then However please see below for an initial
suggestions:

Look out for replies by the following people. However I must
clarify that there are MANY knowledgeable folks over here
(microsoft.public.sqlserver.olap)and I am only listing some of the names
whose replies have REALLY helped me (or whose replies to someone else's
questions have added to my knowledge and clarifying my concepts). BTW,
the order in they are listed does not represent any thing and I
sincerely that hope I will not offend anyone by listing the
following/missing out their name:

Michael Vardinghus, Andrej Hudoklin, Graeme Scott, Steve Kass, Alejandro
Mesa, Tom Chester, Abhinav Kumar, Dave Wickert, Nigel Pendse, Allan
Mitchell, Deepak Puri, Andrea Temporin, and many more.

Okay enough of the Grammy nomination thank you speech (Ooppss... I
forgot to mention my family members).


Regards.
P.S.
I am surprised why none of the folks have replied to your post yet???
Unless they've sent you reply via email.

Reply With Quote
  #5  
Old   
Pluto
 
Posts: n/a

Default Re: Single fact table, without dimension tables - 05-07-2004 , 01:54 AM



Hello,

Thanks for you feedback,

I'm also surprised my post did not trig reactions from the OLAP gurus of
this newsgroup...

Meanwhile, I found interesting threads on Internet regarding my question:

1. One giant table
http://www.dbforums.com/t897629.html

2. Degenerate dimensions
http://199.239.255.16/html/articles.html

I'm really tempted doing my job with a giant table, and database views. I
understand this may not be optimal, from a storage and processing time point
of view, but clearly, the final database won't be a terrabyte monster
either. As long as the cube presents coherent data, and it does not take 2
weeks to build, really, I don't care at all...



"Learner" <wantnospam (AT) email (DOT) com> a écrit dans le message de
news:MPG.1b0541e910c772509896ec (AT) msnews (DOT) microsoft.com...
Quote:
Hi,

I don't want to misguide you since I'm a starter myself. Please allow me
a little time to get back to you with useful replies to your actual
questions. Till then However please see below for an initial
suggestions:

Look out for replies by the following people. However I must
clarify that there are MANY knowledgeable folks over here
(microsoft.public.sqlserver.olap)and I am only listing some of the names
whose replies have REALLY helped me (or whose replies to someone else's
questions have added to my knowledge and clarifying my concepts). BTW,
the order in they are listed does not represent any thing and I
sincerely that hope I will not offend anyone by listing the
following/missing out their name:

Michael Vardinghus, Andrej Hudoklin, Graeme Scott, Steve Kass, Alejandro
Mesa, Tom Chester, Abhinav Kumar, Dave Wickert, Nigel Pendse, Allan
Mitchell, Deepak Puri, Andrea Temporin, and many more.

Okay enough of the Grammy nomination thank you speech (Ooppss... I
forgot to mention my family members).


Regards.
P.S.
I am surprised why none of the folks have replied to your post yet???
Unless they've sent you reply via email.



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.