dbTalk Databases Forums  

DISTINCT COUNT performance and Snapshot Query?

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss DISTINCT COUNT performance and Snapshot Query? in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Karl Kvool
 
Posts: n/a

Default DISTINCT COUNT performance and Snapshot Query? - 03-22-2006 , 05:07 PM






Using SQL Server AS 2000...I have read this document probably 1000 times
because I am constantly struggling with performance with counting distinct
customers in large data warehouses, especially when the result set is sliced
by multiple dimensions.

http://msdn.microsoft.com/library/de.../distinct2.asp

What normally happens with a distinct cube (the normal solution to the
distinct count problem), is the two slices get evaluated separately, and then
added together at the end, which just about doubles the result of the
distinct count measure, and renders it invalid. In this document it explains
how to accomplish this distinct counting using a calculated measure which
cross-joins a measure with a set, but the performance hit is unbearable (5-10
minutes on a very beefy server) if you have more than a couple thousand
records in your fact table. It gets really bad if you slice by ranges of
dimensions (for example, service date range of 6 months and paid date range
of a different 6 months, to determine the number of unique customers that
were serviced and then paid in those two date ranges).

Anyway, at the bottom of the document I reference above, for performance
conderations it talks about how DISTINCT COUNT queries should always be
snapshot queries. But try to find any other reference to what that means
anywhere else in MSDN. What are they talking about? It also talks about
server vs. client side query execution. The only reference I can find for
this is a property on the provider string. However, when I set this to
server-side execution, it didn't make a difference in the response time, and
also it didn't affect which process was using the CPU (which indicated to me
that the client was still resolving the counts).

Does anyone have any more insight into these performance considerations
(snapshot, server-side execution) for DISTINCT COUNT calculations ? I would
welcome any input to help me resolve this issue which has been vexing me for
years.

Thanks!

Reply With Quote
  #2  
Old   
Karl Kvool
 
Posts: n/a

Default RE: DISTINCT COUNT performance and Snapshot Query? - 03-23-2006 , 02:49 PM






I did solve the remote execution problem by adding two additional values to
my connection string. However, this did nothing to improve the performance
problems. Any ideas or suggestions are greatly appreciated.

"Karl Kvool" wrote:

Quote:
Using SQL Server AS 2000...I have read this document probably 1000 times
because I am constantly struggling with performance with counting distinct
customers in large data warehouses, especially when the result set is sliced
by multiple dimensions.

http://msdn.microsoft.com/library/de.../distinct2.asp

What normally happens with a distinct cube (the normal solution to the
distinct count problem), is the two slices get evaluated separately, and then
added together at the end, which just about doubles the result of the
distinct count measure, and renders it invalid. In this document it explains
how to accomplish this distinct counting using a calculated measure which
cross-joins a measure with a set, but the performance hit is unbearable (5-10
minutes on a very beefy server) if you have more than a couple thousand
records in your fact table. It gets really bad if you slice by ranges of
dimensions (for example, service date range of 6 months and paid date range
of a different 6 months, to determine the number of unique customers that
were serviced and then paid in those two date ranges).

Anyway, at the bottom of the document I reference above, for performance
conderations it talks about how DISTINCT COUNT queries should always be
snapshot queries. But try to find any other reference to what that means
anywhere else in MSDN. What are they talking about? It also talks about
server vs. client side query execution. The only reference I can find for
this is a property on the provider string. However, when I set this to
server-side execution, it didn't make a difference in the response time, and
also it didn't affect which process was using the CPU (which indicated to me
that the client was still resolving the counts).

Does anyone have any more insight into these performance considerations
(snapshot, server-side execution) for DISTINCT COUNT calculations ? I would
welcome any input to help me resolve this issue which has been vexing me for
years.

Thanks!

Reply With Quote
  #3  
Old   
Deepak Puri
 
Posts: n/a

Default Re: DISTINCT COUNT performance and Snapshot Query? - 03-24-2006 , 11:37 PM



Just curious why you wouldn't consider AS 2005, since it natively
supports Distinct Count across selections of multiple members (I assume
that's what you meant by "..the two slices get evaluated separately..",
since the AS 2000 Distinct Count measure can handle single selections on
each of multiple dimensions).


- Deepak

Deepak Puri
Microsoft MVP - SQL Server

*** Sent via Developersdex http://www.developersdex.com ***

Reply With Quote
  #4  
Old   
Karl Kvool
 
Posts: n/a

Default Re: DISTINCT COUNT performance and Snapshot Query? - 03-28-2006 , 01:26 PM



Deepak,

Actually, that will be my ultimate solution if I cannot accomplish this with
AS 2000. However, the front-end that I use does not yet support 2005 and it
is not clear when it will, so I was still holding out hope that there was a
workable way to do this under 2000.

Thanks,

-Karl

"Deepak Puri" wrote:

Quote:
Just curious why you wouldn't consider AS 2005, since it natively
supports Distinct Count across selections of multiple members (I assume
that's what you meant by "..the two slices get evaluated separately..",
since the AS 2000 Distinct Count measure can handle single selections on
each of multiple dimensions).


- Deepak

Deepak Puri
Microsoft MVP - SQL Server

*** Sent via Developersdex http://www.developersdex.com ***


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.