dbTalk Databases Forums  

Change 'order by' in processing cube

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Change 'order by' in processing cube in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Dave A
 
Posts: n/a

Default Change 'order by' in processing cube - 06-02-2005 , 11:55 PM






I have a cube whose processing is getting a bit slow. There are 41000000
records.

When I process the cube the generation of the dimensions is very quick
(relatively) since I have placed indexes over the table that correlate to
the dimensions.

It is when it does the last step - the 'select [all column names] from
mytable order by [some column name]' that is where the problem lies. The
'order by' is being performed on the first column for some reason. This is a
really bad column since there is no index on it and even if there was I am
not sure that it would help that much.

The execution plan reveals that the bottleneck is on the 'order by'. This is
not surprising since there is so much data.

How can I tell the cube to use another column or preferably have no 'order
by' at all? There seems to be no property that control this.

Regards
Dave A



Reply With Quote
  #2  
Old   
Dave Wickert [MSFT]
 
Posts: n/a

Default Re: Change 'order by' in processing cube - 06-03-2005 , 01:15 AM






You have a distinct count measure, don't you?
--
Dave Wickert [MSFT]
dwickert (AT) online (DOT) microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
--
This posting is provided "AS IS" with no warranties, and confers no rights.


"Dave A" <dave (AT) sigmasolutionsdonotspamme (DOT) com.au> wrote

Quote:
I have a cube whose processing is getting a bit slow. There are 41000000
records.

When I process the cube the generation of the dimensions is very quick
(relatively) since I have placed indexes over the table that correlate to
the dimensions.

It is when it does the last step - the 'select [all column names] from
mytable order by [some column name]' that is where the problem lies. The
'order by' is being performed on the first column for some reason. This is
a really bad column since there is no index on it and even if there was I
am not sure that it would help that much.

The execution plan reveals that the bottleneck is on the 'order by'. This
is not surprising since there is so much data.

How can I tell the cube to use another column or preferably have no 'order
by' at all? There seems to be no property that control this.

Regards
Dave A




Reply With Quote
  #3  
Old   
Dave A
 
Posts: n/a

Default Re: Change 'order by' in processing cube - 06-03-2005 , 02:56 AM



Dave,

OK - I see your point.

This seems strange behaviour though. Why does the data need to be ordered
to calculate a distinct count? I understand the algorithm that you would
have used and I can also see why this algorithm imposes the 'one distinct
count per cube' limitation.

If you had used an algorithm that did not rely on the data being sorted then
you would not have the limitation and my cube would calculate MUCH faster.

Has this been addressed in 2005?

I have a process (made up of several steps) that starts at midnight and it
needs to be finished at 8:00am. At the moment the slowest step is the
calculation of this cube. It takes about 3 hours and the process is
extending beyond the 8:00am curfew.

Regards
Dave A





"Dave Wickert [MSFT]" <dwickert (AT) online (DOT) microsoft.com> wrote

Quote:
You have a distinct count measure, don't you?
--
Dave Wickert [MSFT]
dwickert (AT) online (DOT) microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
--
This posting is provided "AS IS" with no warranties, and confers no
rights.


"Dave A" <dave (AT) sigmasolutionsdonotspamme (DOT) com.au> wrote in message
news:eUoLvh$ZFHA.1404 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
I have a cube whose processing is getting a bit slow. There are 41000000
records.

When I process the cube the generation of the dimensions is very quick
(relatively) since I have placed indexes over the table that correlate to
the dimensions.

It is when it does the last step - the 'select [all column names] from
mytable order by [some column name]' that is where the problem lies. The
'order by' is being performed on the first column for some reason. This
is a really bad column since there is no index on it and even if there
was I am not sure that it would help that much.

The execution plan reveals that the bottleneck is on the 'order by'. This
is not surprising since there is so much data.

How can I tell the cube to use another column or preferably have no
'order by' at all? There seems to be no property that control this.

Regards
Dave A






Reply With Quote
  #4  
Old   
Dave Wickert [MSFT]
 
Posts: n/a

Default Re: Change 'order by' in processing cube - 06-04-2005 , 01:30 AM



A best practice is to separate your distinct count to a separate physical
cube. Join the two cubes together in a virtual cube. The original cube
processes much faster and so does the distinct count cube. The same
algorithm is used in 2005.
--
Dave Wickert [MSFT]
dwickert (AT) online (DOT) microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
--
This posting is provided "AS IS" with no warranties, and confers no rights.


"Dave A" <dave (AT) sigmasolutionsdonotspamme (DOT) com.au> wrote

Quote:
Dave,

OK - I see your point.

This seems strange behaviour though. Why does the data need to be ordered
to calculate a distinct count? I understand the algorithm that you would
have used and I can also see why this algorithm imposes the 'one distinct
count per cube' limitation.

If you had used an algorithm that did not rely on the data being sorted
then you would not have the limitation and my cube would calculate MUCH
faster.

Has this been addressed in 2005?

I have a process (made up of several steps) that starts at midnight and it
needs to be finished at 8:00am. At the moment the slowest step is the
calculation of this cube. It takes about 3 hours and the process is
extending beyond the 8:00am curfew.

Regards
Dave A





"Dave Wickert [MSFT]" <dwickert (AT) online (DOT) microsoft.com> wrote in message
news:u7JP8OAaFHA.1368 (AT) tk2msftngp13 (DOT) phx.gbl...
You have a distinct count measure, don't you?
--
Dave Wickert [MSFT]
dwickert (AT) online (DOT) microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
--
This posting is provided "AS IS" with no warranties, and confers no
rights.


"Dave A" <dave (AT) sigmasolutionsdonotspamme (DOT) com.au> wrote in message
news:eUoLvh$ZFHA.1404 (AT) TK2MSFTNGP09 (DOT) phx.gbl...
I have a cube whose processing is getting a bit slow. There are 41000000
records.

When I process the cube the generation of the dimensions is very quick
(relatively) since I have placed indexes over the table that correlate
to the dimensions.

It is when it does the last step - the 'select [all column names] from
mytable order by [some column name]' that is where the problem lies.
The 'order by' is being performed on the first column for some reason.
This is a really bad column since there is no index on it and even if
there was I am not sure that it would help that much.

The execution plan reveals that the bottleneck is on the 'order by'.
This is not surprising since there is so much data.

How can I tell the cube to use another column or preferably have no
'order by' at all? There seems to be no property that control this.

Regards
Dave A








Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.