dbTalk Databases Forums  

Count disinct on non-zero items

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Count disinct on non-zero items in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Tom Rowlands
 
Posts: n/a

Default Count disinct on non-zero items - 01-11-2004 , 05:19 PM






Hi,

I'm having some trouble configuring a cube to achieve a hairy count
distinct problem. I have a fact table with three attributes per tuple:
a customer ID (primary key), a date, and a decimal x. So far I have
two dimensions, ID and date and measure x with aggregate function Sum.
This all works very well; Analysis Services makes it very easy.

The problem is that I can't work out how to count the number of
distinct customers who have an x > 0.0 such that the aggregates will
work over any of the dimensions and still return good data.

I have found a number of ways of doing it so far, none of which quite
work.

* I can make a view which adds a new column, xa, where xa = 1 iff x >
0.0. Then I can sum the xa column to find the number of customers but
this only works (because of the frequency of data collection) across
month aggregations and not years.

* Alternatively I can have a view which adds a new column, xb, where
xb = ID iff x > 0.0 and add xb as a new count distinct measure. This
has two drawbacks, first that the count distinct will be off by one
sometimes (as zero is a value in itself) and second that I have about
twenty columns like x in the actual fact table and having twenty count
distinct columns doesn't sound like a good idea even with virtual
cubes. (Using virtual cubes there is also the issue of having to share
dimensions if I want them to join properly.)

Can anyone suggest a good way of solving this problem? Much TIA.

--

Tom Rowlands
<mailto:tom (AT) ltc (DOT) act.edu.au.>
(Sorry, I can't sign this.)

Reply With Quote
  #2  
Old   
Deepak Puri
 
Posts: n/a

Default Re: Count disinct on non-zero items - 01-11-2004 , 08:03 PM






A typical approach would be to define a fact table view that only
includes rows where x > 0.0, in which case the distinct count would not
be off by 1. But if you have 20 x columns, each with a separate
threshold, then this would entail 20 views and 20 physical cubes in a
virtual cube.


- Deepak

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.