![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| ||||
| ||||
|
|
- a transaction table, also used as fact table. |
|
The table has about 200 fields. |
|
Should I create a copy of this table with only user and key fields in it? |
|
I give up. I tried everything I thought I knew and the damn thing is still way too slow. I would highly appreciate if you someone could give me a few moments of your time to go through this (lengthy) question: Here's what I have: - a transaction table, also used as fact table. The table has about 200 fields. Of those, I use perhaps 30 for linking in dimensions and <10 for measures. Is this a smart thing to do? Should I create a copy of this table with only user and key fields in it? - each transaction record is stamped with a weekending date, sales rep code, client code, contact code, manager code, branch code and type of business (3 elements in corresponding dimension). - I created one dimension for sales rep, one for branch and one for business type - the client dimension has two levels, company name and contact name - finally, I created a time dimension for weekending date The cube is MOLAP and was processed with 50% of optimization. I also tried with 100% with no change in performance. Originally, the cube was done on SQL7 OLAP and I just finished getting disappointed with performance on 2000 OA. I'm sure this must be something I did (or didn't do). Now, O open excel and: - drag branch to slice page and pick NY branch only - drag weekendings to columns and choose only Q3 and Q4 of 2003 - drag Markup (calculated measure) to data page - drag client to rows So far, everything's blazingly fast. However, I now drag sales rep to rows to the right of client... and kiss it goodbye. Not even 3 hours later would I see any results. Excel goes mad, taking 100% of CPU and using ram like there is no tomorrow. The same query fired in MDX Sample Application, though sometimes (much) faster, is still far from what one would expect from "precalculated" source. As I said earlier, I tried "everything". I even changed the connection string to Execution Location=3 and Default Isolation Mode=1. I just can't understand why is performance so slow. Is there something wrong in a way I designed the cube or.... or perhaps all of this is normal, is it? Any idea would really be highly appreciated. TIA lc |
#3
| ||||
| ||||
|
|
- a transaction table, also used as fact table. You should use DTS or use some other tool to place the table you want to use as your fact table in another database or preferably another server |
|
Are all of those "columns" used in your cubes? If not, follow first advice and eliminate any columns not needed in your destination |
|
Should I create a copy of this table with only user and key fields in it? A fact table should consist of measures that are commiserate with the granularity of your fact table and surrogate key connections to your dim tables, this is a major performance benefit |
|
I would highly recommend you go out and buy the book by Ralph Kimball called "the data warehouse lifecycle toolkit" Working on cube design without knowing warehouse design can be tough, although very tempting as the cubes can be where you get your business ROI. |
#4
| |||
| |||
|
|
I give up. I tried everything I thought I knew and the damn thing is still way too slow. I would highly appreciate if you someone could give me a few moments of your time to go through this (lengthy) question: Here's what I have: - a transaction table, also used as fact table. The table has about 200 fields. Of those, I use perhaps 30 for linking in dimensions and <10 for measures. Is this a smart thing to do? Should I create a copy of this table with only user and key fields in it? - each transaction record is stamped with a weekending date, sales rep code, client code, contact code, manager code, branch code and type of business (3 elements in corresponding dimension). - I created one dimension for sales rep, one for branch and one for business type - the client dimension has two levels, company name and contact name - finally, I created a time dimension for weekending date The cube is MOLAP and was processed with 50% of optimization. I also tried with 100% with no change in performance. Originally, the cube was done on SQL7 OLAP and I just finished getting disappointed with performance on 2000 OA. I'm sure this must be something I did (or didn't do). Now, O open excel and: - drag branch to slice page and pick NY branch only - drag weekendings to columns and choose only Q3 and Q4 of 2003 - drag Markup (calculated measure) to data page - drag client to rows So far, everything's blazingly fast. However, I now drag sales rep to rows to the right of client... and kiss it goodbye. Not even 3 hours later would I see any results. Excel goes mad, taking 100% of CPU and using ram like there is no tomorrow. The same query fired in MDX Sample Application, though sometimes (much) faster, is still far from what one would expect from "precalculated" source. As I said earlier, I tried "everything". I even changed the connection string to Execution Location=3 and Default Isolation Mode=1. I just can't understand why is performance so slow. Is there something wrong in a way I designed the cube or.... or perhaps all of this is normal, is it? Any idea would really be highly appreciated. TIA lc |
#5
| ||||
| ||||
|
|
I am not sure of all your design choices but I think I can make some general suggestions. |

|
that intersect, and avoid combinations that generate to many empty cells. You have always the option to join separate physical cubes in virtual cubes. |
|
When you crossjoin different dimensions at the leaf/lowest level (put them on the same axis) you create a big performance problem. |
|
Aggregations are always done from the top(ie all member level in each dimension) and then downwards to more detailed levels, depending on your degree of aggregations. I recommend to start with 10-20 percent aggregation. I do not think that aggregation is an issue here. |
#6
| |||
| |||
|
#7
| |||
| |||
|
|
"Thomas Ivarsson" <TI_Nospam (AT) hotmail (DOT) com> wrote in message news:O0IwQ9VoDHA.2964 (AT) tk2msftngp13 (DOT) phx.gbl... I am not sure of all your design choices but I think I can make some general suggestions. Me neither ![]() that intersect, and avoid combinations that generate to many empty cells. You have always the option to join separate physical cubes in virtual cubes. I, sort of, sensed this might be an issue and I went ahead and cut them down to a smaller number. Either way I see it, I have to come up with 12 dimensions. Among these, I can't favor any single one nor can I afford to omit it as all of them are frequently used. While trying to figure out what might be the problem, I removed all dimensions but 5. One Time dimension, two standard one-level (code to code) and two two-level standard dimensions. I still see no visible improvement. One thing is for sure, when dimensions with small number of elements are involved, everything is very fast. Client, however, is a dimension that consists of 10000 elements. When used in any kind of cross-join, client slows down to a crawl. I **am** limiting number of clients to show down to hundreds (sliced by Company and limited to Time Q4 of 2003) but this doesn't seem to speed things up. When you crossjoin different dimensions at the leaf/lowest level (put them on the same axis) you create a big performance problem. And here is where I stand in shock. I though that's what OLAP is all about. I can generate same outputs from my RDB in a nick of time. Why would this be so complicated for OLAP/AS. In concept, it's the same thing: give me this and that where this is in (list) and that > whatever and otherthing.... Is PTS combined with AS a wrong choice? I'm confused. Sorry. Aggregations are always done from the top(ie all member level in each dimension) and then downwards to more detailed levels, depending on your degree of aggregations. I recommend to start with 10-20 percent aggregation. I do not think that aggregation is an issue here. Probably not as I tried with option "when performance gain reaches 100%" and I haven't seen any speed improvement. Please don't anyone get me wrong, I do understand that I'm not familiar with data warehousing theory/practise, but my 20 years with databases in general say something's not right here. Having dimensions that join with fact table on code-to-code - that's all that I did. What kind of theory doesn't lead to this kind of structure. I'm not going for anything obscure - no star schemas, no dynamic dimensions/cubes... nothing fancy; and yet... Thanks for your help and your time. Very appreciated. lc |
#8
| |||
| |||
|
|
Can you post your DDL for the tables you are using for your cubes? In one post you mentioned a transactional table with more than 200 columns, in another you list a fact table with only measures and foreign key references, in another you mention you do not use a star schema but you say your fact table links to the dimension tables by a code, it would help to see the table layouts for the tables you use in your cubing. |
#9
| |||
| |||
|
#10
| |||
| |||
|
![]() |
| Thread Tools | |
| Display Modes | |
| |