![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
My customer has a table that is fairly similar to the below that they currently used for reporting purpose. To reduce SQL joins, this table has been de-normalized. I am supposed to build an OLAP cube using this table as the Fact Table as well as the Dimension Tables. I know that I can still build my Dimensions (like Customer, Industry, MarketSegment, etc) from the Fact Table. But this does not look like a standard design whereby we have separate Dimension Tables for Customer, Industry and Market Segment. Would this design speed up or slower down cube processing? What are the other disadvantages to this design? Fact Table: TimeID CustomerID CustomerName IndustryID IndustryDesc MarketSegmentID MarketSegementDesc ... SalesAmount ProfitMargin ... |
#3
| |||
| |||
|
|
Hi Pebble! It is never a good idea to build Dimensions off of a fact table even though all the books and even the MOC course show you how to do this. Why? If you have dimension members that do not have associated facts then you will never see the member - so for a time dimension build on a fact table for a shop that opens monday to saturday and sometimes on a sunday, your time dimension will show most weeks have 6 days and a few have 7! The other reason is processing time for dimensions - Imagine you have a sales fact with millions of rows and the Time dimension is being built of of this table - it has to process millions of rows to build the fact rather than the few hundred if it had it's own table. Hope this helps! Paul "Pebble In The Stream" wrote: My customer has a table that is fairly similar to the below that they currently used for reporting purpose. To reduce SQL joins, this table has been de-normalized. I am supposed to build an OLAP cube using this table as the Fact Table as well as the Dimension Tables. I know that I can still build my Dimensions (like Customer, Industry, MarketSegment, etc) from the Fact Table. But this does not look like a standard design whereby we have separate Dimension Tables for Customer, Industry and Market Segment. Would this design speed up or slower down cube processing? What are the other disadvantages to this design? Fact Table: TimeID CustomerID CustomerName IndustryID IndustryDesc MarketSegmentID MarketSegementDesc ... SalesAmount ProfitMargin ... |
#4
| |||
| |||
|
|
Paul, thanks for your advice. I do understand your point about missing values in dimension. We do intend to have a separate dimension table for Time. But the customer is not very convinced and is reluctant to have her DTS team create and populate the other Dimension tables. At least, I have your affirmative advice on the impact on processing time now. What about this: Can I create Dimension views instead of tables? The Dimensions view is basically a select distinct SQL on the Fact table. Then I will create Dimension using the view instead of the Fact table. Will this help in the processing? "Paul" wrote: Hi Pebble! It is never a good idea to build Dimensions off of a fact table even though all the books and even the MOC course show you how to do this. Why? If you have dimension members that do not have associated facts then you will never see the member - so for a time dimension build on a fact table for a shop that opens monday to saturday and sometimes on a sunday, your time dimension will show most weeks have 6 days and a few have 7! The other reason is processing time for dimensions - Imagine you have a sales fact with millions of rows and the Time dimension is being built of of this table - it has to process millions of rows to build the fact rather than the few hundred if it had it's own table. Hope this helps! Paul "Pebble In The Stream" wrote: My customer has a table that is fairly similar to the below that they currently used for reporting purpose. To reduce SQL joins, this table has been de-normalized. I am supposed to build an OLAP cube using this table as the Fact Table as well as the Dimension Tables. I know that I can still build my Dimensions (like Customer, Industry, MarketSegment, etc) from the Fact Table. But this does not look like a standard design whereby we have separate Dimension Tables for Customer, Industry and Market Segment. Would this design speed up or slower down cube processing? What are the other disadvantages to this design? Fact Table: TimeID CustomerID CustomerName IndustryID IndustryDesc MarketSegmentID MarketSegementDesc ... SalesAmount ProfitMargin ... |
#5
| |||
| |||
|
|
Paul, thanks for your advice. I do understand your point about missing values in dimension. We do intend to have a separate dimension table for Time. But the customer is not very convinced and is reluctant to have her DTS team create and populate the other Dimension tables. At least, I have your affirmative advice on the impact on processing time now. What about this: Can I create Dimension views instead of tables? The Dimensions view is basically a select distinct SQL on the Fact table. Then I will create Dimension using the view instead of the Fact table. Will this help in the processing? "Paul" wrote: Hi Pebble! It is never a good idea to build Dimensions off of a fact table even though all the books and even the MOC course show you how to do this. Why? If you have dimension members that do not have associated facts then you will never see the member - so for a time dimension build on a fact table for a shop that opens monday to saturday and sometimes on a sunday, your time dimension will show most weeks have 6 days and a few have 7! The other reason is processing time for dimensions - Imagine you have a sales fact with millions of rows and the Time dimension is being built of of this table - it has to process millions of rows to build the fact rather than the few hundred if it had it's own table. Hope this helps! Paul "Pebble In The Stream" wrote: My customer has a table that is fairly similar to the below that they currently used for reporting purpose. To reduce SQL joins, this table has been de-normalized. I am supposed to build an OLAP cube using this table as the Fact Table as well as the Dimension Tables. I know that I can still build my Dimensions (like Customer, Industry, MarketSegment, etc) from the Fact Table. But this does not look like a standard design whereby we have separate Dimension Tables for Customer, Industry and Market Segment. Would this design speed up or slower down cube processing? What are the other disadvantages to this design? Fact Table: TimeID CustomerID CustomerName IndustryID IndustryDesc MarketSegmentID MarketSegementDesc ... SalesAmount ProfitMargin ... |
![]() |
| Thread Tools | |
| Display Modes | |
| |