![]() | |
![]() |
| | Thread Tools | Display Modes |
#11
| |||
| |||
|
|
From the DDL I assume you are using your application (Natural) keys to link from the fact table to the dim tables. You would get a performance boost if you took it one more step and used all surrogate (identity) keys to link instead of char columns. |
|
This is not that bad to do, you use what you have as a staging area and then generate your dim tables with the same info but with an integer identity and store that linking integer column in your fact table. |
#12
| |||
| |||
|
|
"Ray Higdon" <rayhigdon (AT) higdonconsulting (DOT) com> wrote in message news:O7cTheaoDHA.1020 (AT) TK2MSFTNGP09 (DOT) phx.gbl... From the DDL I assume you are using your application (Natural) keys to link from the fact table to the dim tables. You would get a performance boost if you took it one more step and used all surrogate (identity) keys to link instead of char columns. I realize chars are not that "popular" choice (took DB over and couldn't change them). I suspect integers are acceptable, right? This is not that bad to do, you use what you have as a staging area and then generate your dim tables with the same info but with an integer identity and store that linking integer column in your fact table. If I read this correctly, the design that I have, besides char PK's, is what a "normal" cube should look like. Now, just one stupid question: do you think that integers over chars would really mean that kind of performance boost (minutes over hours)? Is cross-join of 2 one-thousand dimensions supposed to be slow and how (non-withstanding hardware). I'm asking because I honestly don't know; haven't seen what other people have or use. I base my expectations purely on my rdb experience. Thanks once again for your time. lc |
#13
| |||
| |||
|
|
Yes, integers are preferred, It's probably not going to make that much of a difference, just curious, do you have the latest service pack for Analysis Services installed? With the numbers and DDL you have provided, you should not be experiencing that poor of performance. |
|
There is a good article (pretty lengthy though)on MS that may help as well. |
Thanks a lot for your patience.
#14
| |||
| |||
|
|
"Thomas Ivarsson" <TI_Nospam (AT) hotmail (DOT) com> wrote in message news:O0IwQ9VoDHA.2964 (AT) tk2msftngp13 (DOT) phx.gbl... I am not sure of all your design choices but I think I can make some general suggestions. Me neither ![]() that intersect, and avoid combinations that generate to many empty cells. You have always the option to join separate physical cubes in virtual cubes. I, sort of, sensed this might be an issue and I went ahead and cut them down to a smaller number. Either way I see it, I have to come up with 12 dimensions. Among these, I can't favor any single one nor can I afford to omit it as all of them are frequently used. While trying to figure out what might be the problem, I removed all dimensions but 5. One Time dimension, two standard one-level (code to code) and two two-level standard dimensions. I still see no visible improvement. One thing is for sure, when dimensions with small number of elements are involved, everything is very fast. Client, however, is a dimension that consists of 10000 elements. When used in any kind of cross-join, client slows down to a crawl. I **am** limiting number of clients to show down to hundreds (sliced by Company and limited to Time Q4 of 2003) but this doesn't seem to speed things up. TI: Do you have a lot of member properties at the lowest level? Do you use a lot of calculated members? |
|
When you crossjoin different dimensions at the leaf/lowest level (put them on the same axis) you create a big performance problem. And here is where I stand in shock. I though that's what OLAP is all about. I can generate same outputs from my RDB in a nick of time. Why would this be so complicated for OLAP/AS. In concept, it's the same thing: give me this and that where this is in (list) and that > whatever and otherthing.... Is PTS combined with AS a wrong choice? I'm confused. Sorry. |
|
Aggregations are always done from the top(ie all member level in each dimension) and then downwards to more detailed levels, depending on your degree of aggregations. I recommend to start with 10-20 percent aggregation. I do not think that aggregation is an issue here. Probably not as I tried with option "when performance gain reaches 100%" and I haven't seen any speed improvement. Please don't anyone get me wrong, I do understand that I'm not familiar with data warehousing theory/practise, but my 20 years with databases in general say something's not right here. Having dimensions that join with fact table on code-to-code - that's all that I did. What kind of theory doesn't lead to this kind of structure. I'm not going for anything obscure - no star schemas, no dynamic dimensions/cubes... nothing fancy; and yet... Thanks for your help and your time. Very appreciated. lc |
#15
| |||
| |||
|
|
TI: Do you have a lot of member properties at the lowest level? Do you use a lot of calculated members? |
|
TI: No OLAP is most about aggregations. The engine creates levels above the leaf level in each dimension. When you crossjoin at the leaf level you combine all the members in the crossjoined dimension. So with two dimension of 1000 leaf members in each you will get 1000000 combinations. A solution is to use the nonemptycrossjoin function i an MDX-statement. |
#16
| |||
| |||
|
|
|||||||||||||||||||||||||||||||||||||||||||||||||| | |
![]() |
| Thread Tools | |
| Display Modes | |
| |