Count distinct on non-zero items -
01-11-2004
, 05:24 PM
Hi,
I'm having some trouble configuring a cube to achieve a hairy count
distinct problem. I have a fact table with three attributes per tuple:
a customer ID (primary key), a date, and a decimal x. So far I have
two dimensions, ID and date and measure x with aggregate function Sum.
This all works very well; Analysis Services makes it very easy.
The problem is that I can't work out how to count the number of
distinct customers who have an x > 0.0 such that the aggregates will
work over any of the dimensions and still return good data.
I have found a number of ways of doing it so far, none of which quite
work.
* I can make a view which adds a new column, xa, where xa = 1 iff x >
0.0. Then I can sum the xa column to find the number of customers but
this only works (because of the frequency of data collection) across
month aggregations and not years.
* Alternatively I can have a view which adds a new column, xb, where
xb = ID iff x > 0.0 and add xb as a new count distinct measure. This
has two drawbacks, first that the count distinct will be off by one
sometimes (as zero is a value in itself) and second that I have about
twenty columns like x in the actual fact table and having twenty count
distinct columns doesn't sound like a good idea even with virtual
cubes. (Using virtual cubes there is also the issue of having to share
dimensions if I want them to join properly.)
Can anyone suggest a good way of solving this problem? Much TIA.
--
Tom Rowlands
<mailto:tom (AT) ltc (DOT) act.edu.au.>
(Sorry, I can't sign this.) |