dbTalk Databases Forums  

Calculated Measures for selective records

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Calculated Measures for selective records in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Stefan Farthofer
 
Posts: n/a

Default Calculated Measures for selective records - 09-02-2004 , 04:53 AM






Hi,

I was wondering what the best way for implementing calculated measures which
only operate on certain records is.

The problem I am trying to solve is, that not all records have the same data
quality. For example, some measures could be missing for various reasons.
Now if some calculated measure computes relations such as
[Measures].[ApproximationQuality] = [Measures].[RealCost] /
[Measures].[ApproximateCost], one would only really want to do that for the
aggregations of all records that do have both values, [Measures].[RealCost]
and [Measures].[ApproximateCost], non null.

I can think of a few ways how one may achieve the goal:

1) you could use a Filter(...) to to get all records that do have both
fields entered.

This could get a little messy since you would need to find an appropriate
Dimension from which you can filter and probably will make simple formulas
very complicated, since I imagine you'd have to use several filters in one
calculated measure. On the other hand you could use this for as many
different combinations of required fields as you wish without knowing
anything in advance.

2) you could introduce a kind of DataQuality dimension and define your
calculated measures like ([DataQuality].[Good],[Measures].[RealCost]) /
([DataQuality].[Good],[Measures].[ApproximateCost]).

The problem I see here is that if one has several calculated measures which
work with different base measures, this dimension could become quite tricky
to implement, as you would have to account for the fact that different
caclulations require different measures to be present. This could also be a
problem if you add calculated measures on a regular basis as you would most
certainly need to update the DataQuality dimension too, so it is not as easy
to expand as 1).

3) another solution I have seen is the addition of new measures for the
values you need in your calculation. These get only filled if data for all
the other fields needed is also present. e.g. you introduce
[Measures].[RealCostAQ] and [Measures].[AproximateCostAQ] and only fill
those fields, if both [Measures].[RealCost] and [Measures].[ApproximateCost]
are present. This can, for example, be achieved quiet easily by using a view
as your fact table.

This solutions looks a little bit wacky, and you may easily wind up having
quite a few redundant measures if you have several complex calculated
measures. As with 2) you would have to change the structure of your cubes to
add new calculated measures.

4) The last thing I came up with is a seperate cube that only holds records
of higher quality, so one could define all the calculated measures and not
worry about missing data. This is, strictly speaking, not a valid solution
for the problem, but could non the less be a viable option, if the overhead
of the above solutions prove to be unacceptable.


How do you usually solve such problems? Any comments on the pros and cons
of the solutions mentioned here? Any better ideas / anything I have missed
regarding the issue?

best regards
Stefan Farthofer



Reply With Quote
  #2  
Old   
Stefan Farthofer
 
Posts: n/a

Default Re: Calculated Measures for selective records - 09-02-2004 , 09:05 AM






Hi Jamie,

Quote:
If I understand your problem correctly (unlikely knowing me), a derivation
of the following should solve your problem:
Thanks for the quick reply, Jamie. I should have included examples of what I
meant, my descriptions tend to be a little bit rushed and chaotic. So here
follow examples for the problem and the (possible) solutions:

Assume the following data (way simplified):

[ApproximateCost] [RealCost]
[Store].[USA] 12000 14000
[Store].[Europe] null 300000
[Store].[Asia] 2000 3000

now assume all members have sum as their aggregation function and compute
the following query:

WITH MEMBER [Measures].[ApproximationQuality] AS
'[Measures].[RealCost] / [Measures].[ApproximateCost]'
select {[Measures].[ApproximationQuality]} on AXIS[0]
{[Store].[All Stores]} on AXIS[1]
from CostData


what you get will be ([Store].[All Stores],[Measures].[RealCost]) /
([Store].[All Stores],[Measures].[ApproximateCost])
which amounts to
(14000+300000+3000) / (12000+2000) = 22.64

Now that is not what one would expect from the measure ApproximationQuality.
The "right" way (at least what I intended) is to perform the calculation
ignoring [Store].[Europe], since the [ApproximateCost] is missing there and
instead compute:
(14000+3000)/(12000+2000) = 1.21


Your proposal

Quote:
WITH MEMBER [Measures].[ApproximationQuality] AS
'IIF([Measures].[ApproximateCost] IS NULL OR
IsEmpty([Measures].[ApproximateCost]), 0, [Measures].[RealCost] /
[Measures].[ApproximateCost])'
does not work either, as a simple null check will not help here as neither
([Store].[All Stores],[Measures].[RealCost]) nor ([Store].[All
Stores],[Measures].[ApproximateCost]) are null.

My question is whether someone has encountered that problem and how they
dealt with it and what people think about the ways I came up with / have see
n for solving that problem.

Here are the missing examples to my proposed solutions, for a more abstract
description and my perceived pros and cons see my original post...

1) Using a filter to do the same query like

WITH MEMBER [Measures].[ApproximationQuality] AS
'sum(crossjoin({filter([Store].currentmember.children,
([Store].currentmember, [Measures].[RealCost] is not null) and
([Store].currentmember, [Measures].[ApproximateCost] is not
null)},{[Measures].[RealCost]})
/
sum(crossjoin({filter([Store].currentmember.children,
([Store].currentmember, [Measures].[RealCost] is not null) and
([Store].currentmember, [Measures].[ApproximateCost] is not
null)},{[Measures].[ApproximateCost]})'
select {[Measures].[ApproximationQuality]} on AXIS[0]
{[Store].[All Stores]} on AXIS[1]
from CostData

I know that the filter() statements will need some work to be really useful
as you'd probably want to get down to leave level, but I think this simple
version is enough to show the idea behind this way of solving the problem.

2) Introduce a dimension [DataQuality] ([DataQuality].[Good],
[DataQualityBad]) such that all fact table entries which have both
[RealCost] and [ApproximateCost] filled out get [DataQuality].[Good] and all
other get [DataQuality].[Bad].

Assuming that dimension, the query would look like
WITH MEMBER [Measures].[ApproximationQuality] AS
'([DataQuality].[Good],[Measures].[RealCost]) /
([DataQuality].[Good],[Measures].[ApproximateCost])'
select {[Measures].[ApproximationQuality]} on AXIS[0]
{[Store].[All Stores]} on AXIS[1]
from CostData

3) Introduce two new measures [Measures].[RealCostAQ] and
[Measures].[AproximateCostAQ]. These would get filled in the ETL scripts
like this:
RealCostAQ = case when RealCost is not null and
ApproximateCost is not null RealCost else null end
ApproximateCostAQ = case when RealCost is not null and
ApproximateCost is not null ApproximateCost else null end
This way our original query could be rewritten to

WITH MEMBER [Measures].[ApproximationQuality] AS
'[Measures].[RealCostAQ] / [Measures].[ApproximateCostAQ]'
select {[Measures].[ApproximationQuality]} on AXIS[0]
{[Store].[All Stores]} on AXIS[1]
from CostData

and produce the desired result.

4) Define a new cube and insert only records into its fact table, that have
RealCost and ApproximateCost non null.

Hope that helps to clarify my first post, please tell me if anything is
unclear, missing or wrong.

On a final note, thanks for reading all the way, these are rather long
posts, but I feel that coming up with some potential solutions is a better
base to start a discussion from (or rule out all at once .

PS that post made me wish newsreaders had syntax highlightning...

regards
Stefan Farthofer




Reply With Quote
  #3  
Old   
Stefan Farthofer
 
Posts: n/a

Default Re: Calculated Measures for selective records - 09-10-2004 , 02:38 AM



"Jamie" <Jamie (AT) discussions (DOT) microsoft.com> schrieb im Newsbeitrag
news:C7B9FB46-D971-40E4-A201-25E3E8DCEAC8 (AT) microsoft (DOT) com...
Quote:
Personally speaking Stefan it sounds as though option 1) would be your
best
option.
Yes it would look unweildy but that is of lesser concern. What is more
important (IMO) is that it wouldn't require restructuring your cube.
Thats my opinion for what its worth. Let us know what you decide to go
with.

Since I had to restructure the cube anyway, I temporarily settled with the
additional table fields, as I don't have the time to change the solution if
it turns out to be infeasible complexity or performance wise. When I get to
it, I will do a test with the first 3 options. I'll post the results on this
NG, if something interesting comes out.

Thanks again for your advice,
Stefan Farthofer




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.