dbTalk Databases Forums  

Need help with cube design please

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Need help with cube design please in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #11  
Old   
lc
 
Posts: n/a

Default Re: Need help with cube design please - 11-02-2003 , 07:45 PM







"Ray Higdon" <rayhigdon (AT) higdonconsulting (DOT) com> wrote

Quote:
From the DDL I assume you are using your application (Natural) keys to
link from the fact table to the dim tables. You would get a performance
boost if you took it one more step and used all surrogate (identity)
keys to link instead of char columns.
I realize chars are not that "popular" choice (took DB over and couldn't
change them). I suspect integers are acceptable, right?

Quote:
This is not that bad to do, you
use what you have as a staging area and then generate your dim tables
with the same info but with an integer identity and store that linking
integer column in your fact table.
If I read this correctly, the design that I have, besides char PK's, is what
a "normal" cube should look like.

Now, just one stupid question: do you think that integers over chars would
really mean that kind of performance boost (minutes over hours)? Is
cross-join of 2 one-thousand dimensions supposed to be slow and how
(non-withstanding hardware). I'm asking because I honestly don't know;
haven't seen what other people have or use. I base my expectations purely on
my rdb experience.

Thanks once again for your time.

lc




Reply With Quote
  #12  
Old   
Ray Higdon
 
Posts: n/a

Default Re: Need help with cube design please - 11-02-2003 , 08:05 PM






Yes, integers are preferred, It's probably not going to make that much of a
difference, just curious, do you have the latest service pack for Analysis
Services installed? With the numbers and DDL you have provided, you should
not be experiencing that poor of performance. There is a good article
(pretty lengthy though)on MS that may help as well.

http://www.microsoft.com/technet/tre...e/AnSvcsPG.asp

--
Ray Higdon MCSE, MCDBA, CCNA
---
"lc" <lcdata (AT) hotmail (DOT) com> wrote

Quote:
"Ray Higdon" <rayhigdon (AT) higdonconsulting (DOT) com> wrote in message
news:O7cTheaoDHA.1020 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
From the DDL I assume you are using your application (Natural) keys to
link from the fact table to the dim tables. You would get a performance
boost if you took it one more step and used all surrogate (identity)
keys to link instead of char columns.

I realize chars are not that "popular" choice (took DB over and couldn't
change them). I suspect integers are acceptable, right?

This is not that bad to do, you
use what you have as a staging area and then generate your dim tables
with the same info but with an integer identity and store that linking
integer column in your fact table.

If I read this correctly, the design that I have, besides char PK's, is
what
a "normal" cube should look like.

Now, just one stupid question: do you think that integers over chars would
really mean that kind of performance boost (minutes over hours)? Is
cross-join of 2 one-thousand dimensions supposed to be slow and how
(non-withstanding hardware). I'm asking because I honestly don't know;
haven't seen what other people have or use. I base my expectations purely
on
my rdb experience.

Thanks once again for your time.

lc





Reply With Quote
  #13  
Old   
lc
 
Posts: n/a

Default Re: Need help with cube design please - 11-02-2003 , 08:35 PM




"Ray Higdon" <rayhigdon (AT) higdonconsulting (DOT) com> wrote

Quote:
Yes, integers are preferred, It's probably not going to make that much of
a
difference, just curious, do you have the latest service pack for Analysis
Services installed? With the numbers and DDL you have provided, you should
not be experiencing that poor of performance.
You know, I'm (no longer) sure. Too many installations/patches lately. I'll
just do it (again); just in case.

Quote:
There is a good article
(pretty lengthy though)on MS that may help as well.
I *love* lengthy articles Thanks a lot for your patience.

lc




Reply With Quote
  #14  
Old   
Thomas Ivarsson
 
Posts: n/a

Default Re: Need help with cube design please - 11-03-2003 , 09:46 AM



Have a look at my comments below.
Regards
Thomas Ivarsson

"lc" <lcdata (AT) hotmail (DOT) com> wrote

Quote:
"Thomas Ivarsson" <TI_Nospam (AT) hotmail (DOT) com> wrote in message
news:O0IwQ9VoDHA.2964 (AT) tk2msftngp13 (DOT) phx.gbl...
I am not sure of all your design choices but I think I can make some
general
suggestions.

Me neither

that intersect, and avoid combinations that generate to many empty
cells.
You have always the option to join separate physical cubes in virtual
cubes.

I, sort of, sensed this might be an issue and I went ahead and cut them
down
to a smaller number. Either way I see it, I have to come up with 12
dimensions. Among these, I can't favor any single one nor can I afford to
omit it as all of them are frequently used.

While trying to figure out what might be the problem, I removed all
dimensions but 5. One Time dimension, two standard one-level (code to
code)
and two two-level standard dimensions. I still see no visible improvement.
One thing is for sure, when dimensions with small number of elements are
involved, everything is very fast. Client, however, is a dimension that
consists of 10000 elements. When used in any kind of cross-join, client
slows down to a crawl. I **am** limiting number of clients to show down to
hundreds (sliced by Company and limited to Time Q4 of 2003) but this
doesn't
seem to speed things up.

TI: Do you have a lot of member properties at the lowest level? Do you use
a lot of calculated members?

Quote:
When you crossjoin different dimensions at the leaf/lowest level (put
them
on the same axis) you create a big performance problem.

And here is where I stand in shock. I though that's what OLAP is all
about.
I can generate same outputs from my RDB in a nick of time. Why would this
be
so complicated for OLAP/AS. In concept, it's the same thing: give me this
and that where this is in (list) and that > whatever and otherthing....
Is PTS combined with AS a wrong choice? I'm confused. Sorry.
TI: No OLAP is most about aggregations. The engine creates levels above the
leaf level in each dimension.
When you crossjoin at the leaf level you combine all the members in the
crossjoined dimension. So with two dimension
of 1000 leaf members in each you will get 1000000 combinations. A solution
is to use the nonemptycrossjoin function i an MDX-statement.
Quote:
Aggregations are always done from the top(ie all member level in each
dimension) and then downwards to more detailed levels, depending on your
degree of aggregations. I recommend to start with 10-20 percent
aggregation.
I do not think that aggregation is an issue here.

Probably not as I tried with option "when performance gain reaches 100%"
and
I haven't seen any speed improvement.

Please don't anyone get me wrong, I do understand that I'm not familiar
with
data warehousing theory/practise, but my 20 years with databases in
general
say something's not right here. Having dimensions that join with fact
table
on code-to-code - that's all that I did. What kind of theory doesn't lead
to
this kind of structure. I'm not going for anything obscure - no star
schemas, no dynamic dimensions/cubes... nothing fancy; and yet...

Thanks for your help and your time. Very appreciated.

lc





Reply With Quote
  #15  
Old   
lc
 
Posts: n/a

Default Re: Need help with cube design please - 11-03-2003 , 11:05 AM




"Thomas Ivarsson" <TI_Nospam (AT) hotmail (DOT) com> wrote

Quote:
TI: Do you have a lot of member properties at the lowest level? Do you
use
a lot of calculated members?
I had 2 properties at Contact level, which was second level to Client. While
Client count was/is at around 11000, Contact is/was at 25000.

I have only one calculated member. Now that I see I'll have to put my
original data through a (serious) massage, I'll create a field for it and
store the actual value instead of calculating it.

Quote:
TI: No OLAP is most about aggregations. The engine creates levels above
the
leaf level in each dimension.
When you crossjoin at the leaf level you combine all the members in the
crossjoined dimension. So with two dimension
of 1000 leaf members in each you will get 1000000 combinations. A
solution
is to use the nonemptycrossjoin function i an MDX-statement.
PT services _do_ use nonempty by default. At least that's what I picked up
from the log file.

Thanks a lot Thomas.

lc




Reply With Quote
  #16  
Old   
J O Holloway
 
Posts: n/a

Default Re: Need help with cube design please - 11-05-2003 , 05:30 PM



Don't know if it helps you, but in this small example data warehouse
that I built (model below; just a proof-of-concept for mgt., but uses
real data), the processing time for a full cube is under 5 minutes
(single 2.6 GHz processor, 1 GB RAM). I realize it's not as large as
even your chopped model with only five dimensions, but still .. hours
of processing for your model seems a bit out of bounds.

Barring any other suggestions that make more sense, you might try
dropping to just three dimensions, and if that's still slow, then try
(perhaps from a copy database, if available) changing your keys to use
integers. I haven't run the experiment (of using natural keys)
because Kimball was so adamantly against that in his book (Data
Warehouse Toolkit).

Best of luck.

Quote:
|||||||||||||||||||||||||||||||||||||||||||||||||| |


My Fact Table - around 500,000 records
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^
date_key integer
product_key integer
branch_key integer
sales_slip_key integer {degenerate dim.}
quantity_sold integer
cost smallmoney
price smallmoney
gross_profit smallmoney {calculated}


CalendarDates - 1429 elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^
date_key integer
the_date
day_of_week
calendar_month
calendar_quarter
calendar_year


Products - 4983 elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^
product_key integer
identifier
description
department



Branches - 53 elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^
branch_key integer
branch_number
city
region


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.