dbTalk Databases Forums  

Performance problems incrementally processing a dimension

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Performance problems incrementally processing a dimension in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Jesse O
 
Posts: n/a

Default Performance problems incrementally processing a dimension - 06-14-2005 , 04:33 PM






We have a dimension with four different levels:

1. Account Manager (about 100 members)
2. Sales Account (5000 members)
3. Campaigns (250,000 members)
4. Ads (1,500,000 members)

Being that we pull transcational data over into our warehouse every hour,
the cube also needs to process every hour (requirement from the business
owners). This poses a problem - to incrementally update this dimension takes
about four minutes every hour. And it's not the only fairly large dimension
we have.

My question is...can I do anything to speed this up dramatically? Within a
given hour, less than 1% of the members change within any of these levels.
We currently use the incremental update processing option on the dimension
and the dimension isn't set for changing (isn't an option because the top
and bottom levels can change).

I tried changing changing the underlying views not to pull over the entire
dataset, but rather only the records which have changed since the last
incremental update. However, Analysis Services saw the records as being
deleted and removed them from the dimension. Anyway to change this option?

I really appreciate and anticipate your guys suggestions. This issue is a
HUGE stumbling block.




Reply With Quote
  #2  
Old   
Dave Wickert [MSFT]
 
Posts: n/a

Default Re: Performance problems incrementally processing a dimension - 06-14-2005 , 06:52 PM






Ref: speeding up an incremental update.

No, not really -- at least on the OLAP-side. You should always expose the
entire dimension table for incremental processing. The first place I would
look to speedup processing is on the SQL-side. From the system-wide
processing log file, or from the interactive processing dialog box, you can
see the SQL statement that is issued for dimension processing. Execute that
interactively and see if you can add indexes; faster joins, etc. normal SQL
kinds of things. For example, you will notice that the SQL is a SELECT
DISTINCT. This means that tempdb comes into play. Possibly moving tempdb to
its own high-speed disk system might assist things. Likewise, look at the
execution plan for the SQL statement -- are there indexes you can add to
speed things up.

Ref: deleting a member.

This is more serious. Deleting members from a dimension requires a full
process. Deletes can also be problematic because you cannot delete a member
if it has fact data associated with it. And deleting fact data requires
either reprocessing the partition it exists in; or deleting the partition
entirely (such as when you roll off the N+1 partition of an N partition
design, e.g. the 37th month if you are keeping a rolling 3 year cycle).

So the net-net is that you are going to have to re-think how you design your
system if deletes are included in your hourly update.

--
Dave Wickert [MSFT]
dwickert (AT) online (DOT) microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
--
This posting is provided "AS IS" with no warranties, and confers no rights.


"Jesse O" <jesperzz (AT) hotmail (DOT) com> wrote

Quote:
We have a dimension with four different levels:

1. Account Manager (about 100 members)
2. Sales Account (5000 members)
3. Campaigns (250,000 members)
4. Ads (1,500,000 members)

Being that we pull transcational data over into our warehouse every hour,
the cube also needs to process every hour (requirement from the business
owners). This poses a problem - to incrementally update this dimension
takes about four minutes every hour. And it's not the only fairly large
dimension we have.

My question is...can I do anything to speed this up dramatically? Within a
given hour, less than 1% of the members change within any of these levels.
We currently use the incremental update processing option on the dimension
and the dimension isn't set for changing (isn't an option because the top
and bottom levels can change).

I tried changing changing the underlying views not to pull over the entire
dataset, but rather only the records which have changed since the last
incremental update. However, Analysis Services saw the records as being
deleted and removed them from the dimension. Anyway to change this option?

I really appreciate and anticipate your guys suggestions. This issue is a
HUGE stumbling block.






Reply With Quote
  #3  
Old   
Jesse O
 
Posts: n/a

Default Re: Performance problems incrementally processing a dimension - 06-14-2005 , 07:26 PM



Thanks for the fast update as usual Dave.

I had a feeling there wasn't much to be done on the OLAP side. Oh well. I'll
have to look further into the SQL side.

As far as deletes, we don't do 'em. They're bad in a data warehouse!


"Dave Wickert [MSFT]" <dwickert (AT) online (DOT) microsoft.com> wrote

Quote:
Ref: speeding up an incremental update.

No, not really -- at least on the OLAP-side. You should always expose the
entire dimension table for incremental processing. The first place I would
look to speedup processing is on the SQL-side. From the system-wide
processing log file, or from the interactive processing dialog box, you
can see the SQL statement that is issued for dimension processing. Execute
that interactively and see if you can add indexes; faster joins, etc.
normal SQL kinds of things. For example, you will notice that the SQL is a
SELECT DISTINCT. This means that tempdb comes into play. Possibly moving
tempdb to its own high-speed disk system might assist things. Likewise,
look at the execution plan for the SQL statement -- are there indexes you
can add to speed things up.

Ref: deleting a member.

This is more serious. Deleting members from a dimension requires a full
process. Deletes can also be problematic because you cannot delete a
member if it has fact data associated with it. And deleting fact data
requires either reprocessing the partition it exists in; or deleting the
partition entirely (such as when you roll off the N+1 partition of an N
partition design, e.g. the 37th month if you are keeping a rolling 3 year
cycle).

So the net-net is that you are going to have to re-think how you design
your system if deletes are included in your hourly update.

--
Dave Wickert [MSFT]
dwickert (AT) online (DOT) microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
--
This posting is provided "AS IS" with no warranties, and confers no
rights.


"Jesse O" <jesperzz (AT) hotmail (DOT) com> wrote in message
news:OwrpWiScFHA.720 (AT) TK2MSFTNGP15 (DOT) phx.gbl...
We have a dimension with four different levels:

1. Account Manager (about 100 members)
2. Sales Account (5000 members)
3. Campaigns (250,000 members)
4. Ads (1,500,000 members)

Being that we pull transcational data over into our warehouse every hour,
the cube also needs to process every hour (requirement from the business
owners). This poses a problem - to incrementally update this dimension
takes about four minutes every hour. And it's not the only fairly large
dimension we have.

My question is...can I do anything to speed this up dramatically? Within
a given hour, less than 1% of the members change within any of these
levels. We currently use the incremental update processing option on the
dimension and the dimension isn't set for changing (isn't an option
because the top and bottom levels can change).

I tried changing changing the underlying views not to pull over the
entire dataset, but rather only the records which have changed since the
last incremental update. However, Analysis Services saw the records as
being deleted and removed them from the dimension. Anyway to change this
option?

I really appreciate and anticipate your guys suggestions. This issue is a
HUGE stumbling block.








Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.