![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi All, I am new to MS Analysis services. We have to develop cube for our client. The cube is developed by someone and i have to do some modifications. There is one dimension "xx" from fact table.The frequency of updates is high and no of records are less(around 30,000). Shall we create new dimension table for this "xx" or dimension from the fact table is okay? What is the difference between the two? When do we need to go for the one? If it is seperate dimension table, is it static or changing? Please advice me on this. Thanks in advance Suchi |
#3
| |||
| |||
|
|
While this can sometimes be a religious debate, I think that most folks will say that having an independent dimension table is extremely important. First, it allows you to capture metadata about the dimension which you are probably not carrying in the fact table. For example, an independent dimension table for "Time" would have additional information on weekend, holidays, etc. -- you probably don't carry that in your fact table -- normally you would have just a timestamp. Second, it allows you to have members without matching data. Sometimes this can be important. Again, using Time as an example, you might not have sales over the weekend, but you may need to include the member itself during your analysis. Lastly and very importantly, a dimension based on the fact table does not scale. I work with customer having billions and billions of rows in their fact tables -- having to do dimension processing based on those structures just isn't practical. We have techniques to scale fact tables that large, e.g. partitions -- but we don't have similar techniques for dimension tables. Stick with independent dimension tables. In the long run, you will be much happier. Hope that helps. |
#4
| |||
| |||
|
|
Hello I am newbie as well, still gathering my resource material I am unclear about the criteria for fact table In my situation I have a set of time stamped input data for each Item, from which I generate gobs of Stats data for many variations of (sets of items), each for many variations of time attributes (aggregation past N days, per day of week, per month of year etc etc). And I'll have stored procedures for calculating many of the statistical data derived from the raw data. I can see having 1. One table for each Item (ItemID,DateID, data1,dat2...) for the raw data 2. One table for each Stats(StatID,DateID,ItemCategoryID, time1, time2, ) for calculated stat for each time sensitivity, for each Item associated to the ItemCategryID 3. Dimension for Items with Levels for different categories of Items 4. Dimension for Stats with Levels for different categories of Stats 5. Dimension for Time with Levels for different categories of Time Now lets say I have a 100 Items and 100 Stats this would give me 100 x 100 = 10000 Tables for Stats. Am I losing it here? Should I be reducing the number of Tables? Thanks for any suggestions, or recommended resource materials Kurb Dave Wickert [MSFT] wrote: While this can sometimes be a religious debate, I think that most folks will say that having an independent dimension table is extremely important. First, it allows you to capture metadata about the dimension which you are probably not carrying in the fact table. For example, an independent dimension table for "Time" would have additional information on weekend, holidays, etc. -- you probably don't carry that in your fact table -- normally you would have just a timestamp. Second, it allows you to have members without matching data. Sometimes this can be important. Again, using Time as an example, you might not have sales over the weekend, but you may need to include the member itself during your analysis. Lastly and very importantly, a dimension based on the fact table does not scale. I work with customer having billions and billions of rows in their fact tables -- having to do dimension processing based on those structures just isn't practical. We have techniques to scale fact tables that large, e.g. partitions -- but we don't have similar techniques for dimension tables. Stick with independent dimension tables. In the long run, you will be much happier. Hope that helps. |
#5
| |||
| |||
|
|
Sounds like way to many... You don' need a table in 1 for each item - you already have a itemid ins this table that you can join to a dimension table. What you want to do dimension-wise needs to be controlled by this dimension. The typical case is ONE fact tabel and 10-20 dimensions... Then of course you can merge physical cubes into one total virtual cube - if so desired. /Michael V. "kurb" <mompara (AT) sympatico (DOT) ca> wrote in message news:5rwkd.22978$Z7.772365 (AT) news20 (DOT) bellglobal.com... Hello I am newbie as well, still gathering my resource material I am unclear about the criteria for fact table In my situation I have a set of time stamped input data for each Item, from which I generate gobs of Stats data for many variations of (sets of items), each for many variations of time attributes (aggregation past N days, per day of week, per month of year etc etc). And I'll have stored procedures for calculating many of the statistical data derived from the raw data. I can see having 1. One table for each Item (ItemID,DateID, data1,dat2...) for the raw data 2. One table for each Stats(StatID,DateID,ItemCategoryID, time1, time2, ) for calculated stat for each time sensitivity, for each Item associated to the ItemCategryID 3. Dimension for Items with Levels for different categories of Items 4. Dimension for Stats with Levels for different categories of Stats 5. Dimension for Time with Levels for different categories of Time Now lets say I have a 100 Items and 100 Stats this would give me 100 x 100 = 10000 Tables for Stats. Am I losing it here? Should I be reducing the number of Tables? Thanks for any suggestions, or recommended resource materials Kurb Dave Wickert [MSFT] wrote: While this can sometimes be a religious debate, I think that most folks will say that having an independent dimension table is extremely important. First, it allows you to capture metadata about the dimension which you are probably not carrying in the fact table. For example, an independent dimension table for "Time" would have additional information on weekend, holidays, etc. -- you probably don't carry that in your fact table -- normally you would have just a timestamp. Second, it allows you to have members without matching data. Sometimes this can be important. Again, using Time as an example, you might not have sales over the weekend, but you may need to include the member itself during your analysis. Lastly and very importantly, a dimension based on the fact table does not scale. I work with customer having billions and billions of rows in their fact tables -- having to do dimension processing based on those structures just isn't practical. We have techniques to scale fact tables that large, e.g. partitions -- but we don't have similar techniques for dimension tables. Stick with independent dimension tables. In the long run, you will be much happier. Hope that helps. |
#6
| |||
| |||
|
![]() |
| Thread Tools | |
| Display Modes | |
| |