dbTalk Databases Forums  

design for distinct count

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss design for distinct count in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
raymond via SQLMonster.com
 
Posts: n/a

Default design for distinct count - 02-25-2005 , 06:01 AM






Does a distinct count always have to be on the key of a dimension - the
column that acts as foreign key on the fact?

If I need a distinct count on 2 levels of a dimension (NOT snowflaked,
since i am already using 4-5 levels of snowflaking to build the dim), can I
have both the levels related as columns on the fact and use distinct count
aggregate?

Dimension table
level1, keyforlevel1, level2, keyforlevel2 and level3, keyforlevel3
keyforlevel3 - is the primary key for this table

Fact table
keyforlevel1, keyforlevel2 , keyforlevel3
keyforlevel3 is the foreign key relating to the dim key.

Then i use this same dim to build 2 cubes - one that distinct counts
fact.keyforlevel1 and another fact.keyforlevel2

Seems to work, but I am not sure this is the way to implement this.
Can someone please comment on
1. will this work - it might be bad design, but will it work?
2. better design?

TIA

--
Message posted via http://www.sqlmonster.com

Reply With Quote
  #2  
Old   
Ohjoo Kwon
 
Posts: n/a

Default Re: design for distinct count - 02-26-2005 , 12:10 PM






I understand you want to use one fact table containing every level key as a
measure becuase every measure must come from fact table.

I guess your design works well. If I add something, things below could be
considered more.

First, fact table contains only keyforlevel3 and you use views for each
cube.

Second, use disabled level property of the dimension in each cube for less
disk space and better performance. For example, the cube for keyforlevel1
distinct count dose not need level2 and level3.

Ohjoo Kwon
www.olapforum.com


"raymond via SQLMonster.com" <forum (AT) SQLMonster (DOT) com> wrote

Quote:
Does a distinct count always have to be on the key of a dimension - the
column that acts as foreign key on the fact?

If I need a distinct count on 2 levels of a dimension (NOT snowflaked,
since i am already using 4-5 levels of snowflaking to build the dim), can
I
have both the levels related as columns on the fact and use distinct count
aggregate?

Dimension table
level1, keyforlevel1, level2, keyforlevel2 and level3, keyforlevel3
keyforlevel3 - is the primary key for this table

Fact table
keyforlevel1, keyforlevel2 , keyforlevel3
keyforlevel3 is the foreign key relating to the dim key.

Then i use this same dim to build 2 cubes - one that distinct counts
fact.keyforlevel1 and another fact.keyforlevel2

Seems to work, but I am not sure this is the way to implement this.
Can someone please comment on
1. will this work - it might be bad design, but will it work?
2. better design?

TIA

--
Message posted via http://www.sqlmonster.com



Reply With Quote
  #3  
Old   
rama via SQLMonster.com
 
Posts: n/a

Default Re: design for distinct count - 02-27-2005 , 01:27 AM



Thank you.

Can you elaborate more on yur answer please?

Quote:
First, fact table contains only keyforlevel3 and you use views for each
cube.
Quote:
If the fact table does not have keyforlevel1 and keyforlevel3, then how do
I specify a distinct count on those column?

Quote:
Second, use disabled level property of the dimension in each cube for less
disk space and better performance. For example, the cube for keyforlevel1
distinct count dose not need level2 and level3.
Quote:
Also, if i disable the lower levels in each cube how can i see the details
for that count from the cube? For example, once I have the distinct count
as say 10 at level1, then I need to find out the details of those 10
counts. And the details have to be at all levels - 1, 2 and 3


I just re read distinct count details in BOL and it mentions that distinct
count should be for the lowest level members alone.
"Distinct count measures are commonly used to determine for each member of
a dimension how many distinct, lowest-level members of another dimension
share rows in the fact table".
So I am wondering if what I am doing is wrong.

thanks again

--
Message posted via http://www.sqlmonster.com


Reply With Quote
  #4  
Old   
Ohjoo Kwon
 
Posts: n/a

Default Re: design for distinct count - 02-27-2005 , 09:50 AM



As I said, it seems that your design works well. I added something for
saving disk space and better performance.

Ohjoo


"rama via SQLMonster.com" <forum (AT) SQLMonster (DOT) com> wrote

Quote:
Thank you.

Can you elaborate more on yur answer please?


First, fact table contains only keyforlevel3 and you use views for each
cube.


If the fact table does not have keyforlevel1 and keyforlevel3, then how do
I specify a distinct count on those column?

A fact table with only keyforlevel3 will save disk space. And distinct count
is still possible through views with upper level keys for each distinct
count.


Quote:
Second, use disabled level property of the dimension in each cube for less
disk space and better performance. For example, the cube for keyforlevel1
distinct count dose not need level2 and level3.


Also, if i disable the lower levels in each cube how can i see the details
for that count from the cube? For example, once I have the distinct count
as say 10 at level1, then I need to find out the details of those 10
counts. And the details have to be at all levels - 1, 2 and 3


If you *always* still want to see the values from lower levels, your design
with no disabled setting is simpler. Furthermore, the levels can not be
disabled, if virtual dimensions based on the dimension are used together.

I just thought you don't have to see the details because every value for
lower levels has at most only 1. And if you need, you can use drillthrough
for the details.


Quote:
I just re read distinct count details in BOL and it mentions that distinct
count should be for the lowest level members alone.
"Distinct count measures are commonly used to determine for each member of
a dimension how many distinct, lowest-level members of another dimension
share rows in the fact table".
So I am wondering if what I am doing is wrong.

Logically, I understand the contents of BOL means the unit of distinc count.
For example, if there is a product dimension and we count products
distinctly, the distinct count should be done for the product level members.
But if category, then category-level members. Of course, only atomic cells
can receive data from fact table directly. For this, I mentioned disabled
property.



Quote:
thanks again

--
Message posted via http://www.sqlmonster.com



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.