dbTalk Databases Forums  

Understanding attribute relationships and aggregations in SSAS2005

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Understanding attribute relationships and aggregations in SSAS2005 in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Kevin Dente
 
Posts: n/a

Default Understanding attribute relationships and aggregations in SSAS2005 - 08-11-2006 , 11:06 AM






(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an AS2005
cube. My suspicion was that they were caused by the original cube design not
specifying attribute relationships in the time dimension. I've read the
Project REAL stuff and the P&P BI Guidelines in an effort to fully grok the
implications of attribute relationships, natural hierarchies, and
aggregations, but there's still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day > Hour >
Minute (where minute is actually a 15 minute time window). The dimension
table columns are the key, minute (0,15,30,45), hour of day (0-23), day of
month (0-days in month), month (1-12), year. Originally there were
attributes defined for each of these columns, with each attribute using its
corresponding column as its KeyColumn, and with all attributes related
directly to the key. A user hierarchy provided navigation from the year down
to the minute. Performance, in a word, sucked (drilling down in the cube
browser to a low level in the hierarchy could take as long as 45 minutes!).

I changed the KeyColumn for each attribute below year to be a composite of
each higher level attribute (so the KeyColumns in Month were Year and Month,
for Day they were Year/Month/Day, etc). Then I set the attribute
relationships accordingly. I'm not sure if this is the most efficient way to
define the attributes, but it seemed to be the only way to do it with our
current dimension table schema.

Once I made this change and re-processed, drilling through the hierarchy
went from 45 minutes to less than a second. Not a bad improvement.
However, there's still some things that I'm unclear on.

* The performance seems to be about the same (that is, pretty fast) whether
the aggregations are set to 0% or 70%. That confuses me - I thought that
setting up the relationships is what allows the aggregations to be computed.
But why would no aggregations perform so well?

* I'm assuming that by setting up these relationships, I'm giving up some
flexibility in my queries. Is that true? For example, what if I want to
compare how January compares to February, across all years. Are those
aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?

* I'm looking at making similar changes to a different dimension, but that
dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes in
different navigational hierarchies?

Thanks,
Kevin



Reply With Quote
  #2  
Old   
Akshai Mirchandani [MS]
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-14-2006 , 03:26 PM






Some answers:

Quote:
* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the aggregations
to be computed. But why would no aggregations perform so well?
The benefits of good attribute relationships are in no way limited purely to
disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data can also
be used for aggregating higher levels of data. So a query that happened to
fetch data at the hours level could make a later query at the days level
really fast because the cache for the hour-level data can be used to
aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.

Quote:
* I'm assuming that by setting up these relationships, I'm giving up some
flexibility in my queries. Is that true? For example, what if I want to
compare how January compares to February, across all years. Are those
aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?
In a sense, yes you are losing some flexibility -- now February isn't an
entity by itself, but rather each instance of February is strongly
associated with a year. However, you can still obtain the results you need
with different solutions:
1. You can quite easily write a calculation to aggregate over the 1st child
of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related to Month
(or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model somewhat
more complex by adding more attributes.

Quote:
* I'm looking at making similar changes to a different dimension, but that
dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes in
different navigational hierarchies?
It would help to see an example of this problem.

Generally its okay if the attributes are combined together in different
ways -- but you should make sure that they follow the same rules and
semantics when creating the relationships. Make sure that the meaning of
each attribute is the same in each hierarchy and the functional dependencies
hold true and then it should turn out right.

In some cases, you may decide to actually split up attributes (like Month
and Month Of Year) because they really mean different things in the two
hierarchies. But that is a choice and design decision that you need to make
based on what kind of behavior you want to obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote

Quote:
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an AS2005
cube. My suspicion was that they were caused by the original cube design
not specifying attribute relationships in the time dimension. I've read
the Project REAL stuff and the P&P BI Guidelines in an effort to fully
grok the implications of attribute relationships, natural hierarchies, and
aggregations, but there's still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day > Hour
Minute (where minute is actually a 15 minute time window). The dimension
table columns are the key, minute (0,15,30,45), hour of day (0-23), day of
month (0-days in month), month (1-12), year. Originally there were
attributes defined for each of these columns, with each attribute using
its corresponding column as its KeyColumn, and with all attributes related
directly to the key. A user hierarchy provided navigation from the year
down to the minute. Performance, in a word, sucked (drilling down in the
cube browser to a low level in the hierarchy could take as long as 45
minutes!).

I changed the KeyColumn for each attribute below year to be a composite of
each higher level attribute (so the KeyColumns in Month were Year and
Month, for Day they were Year/Month/Day, etc). Then I set the attribute
relationships accordingly. I'm not sure if this is the most efficient way
to define the attributes, but it seemed to be the only way to do it with
our current dimension table schema.

Once I made this change and re-processed, drilling through the hierarchy
went from 45 minutes to less than a second. Not a bad improvement.
However, there's still some things that I'm unclear on.

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the aggregations
to be computed. But why would no aggregations perform so well?

* I'm assuming that by setting up these relationships, I'm giving up some
flexibility in my queries. Is that true? For example, what if I want to
compare how January compares to February, across all years. Are those
aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?

* I'm looking at making similar changes to a different dimension, but that
dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes in
different navigational hierarchies?

Thanks,
Kevin




Reply With Quote
  #3  
Old   
Kevin Dente
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-14-2006 , 07:44 PM



Thanks Akshai! Great info.

Quote:
So a query that happened to fetch data at the hours level could make a
later query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.
In my case, I was navigating down the hierarchy from year to minute
(high-to-low-level), so I'm assuming your example doesn't apply here.

Quote:
Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.
Is there any more you can say here? This is the situation for me - after
making the user hierarchy into a natural hierachy, drilling down sped up so
dramatically it was amazing. But it all seems a bit like magic. If
disk-based aggregations aren't to thank, then what?

Quote:
2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.
Looking around the AdvWorks cube, I noticed that pattern. I'll keep it in
mind. I just wanted to make sure I was understanding things correctly.

Quote:
It would help to see an example of this problem.
OK, in this case we have a set of business processes, each of which are made
up activities. Processes can be versioned, so each activity has a version
associated with it. One user hierarchy looks something this:

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular version
of a process.

However, for comparitive analysis, it's also useful to have the following
hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

This lets you easily compare at two different versions of the same activity.

So you see this is actually the same set of attributes, stacked in different
ways. I'm not sure how I should design the natural hierarchies here (if at
all).



"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote

Quote:
Some answers:

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the aggregations
to be computed. But why would no aggregations perform so well?

The benefits of good attribute relationships are in no way limited purely
to disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data can
also be used for aggregating higher levels of data. So a query that
happened to fetch data at the hours level could make a later query at the
days level really fast because the cache for the hour-level data can be
used to aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.

* I'm assuming that by setting up these relationships, I'm giving up some
flexibility in my queries. Is that true? For example, what if I want to
compare how January compares to February, across all years. Are those
aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?

In a sense, yes you are losing some flexibility -- now February isn't an
entity by itself, but rather each instance of February is strongly
associated with a year. However, you can still obtain the results you need
with different solutions:
1. You can quite easily write a calculation to aggregate over the 1st
child of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model
somewhat more complex by adding more attributes.

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes in
different navigational hierarchies?

It would help to see an example of this problem.

Generally its okay if the attributes are combined together in different
ways -- but you should make sure that they follow the same rules and
semantics when creating the relationships. Make sure that the meaning of
each attribute is the same in each hierarchy and the functional
dependencies hold true and then it should turn out right.

In some cases, you may decide to actually split up attributes (like Month
and Month Of Year) because they really mean different things in the two
hierarchies. But that is a choice and design decision that you need to
make based on what kind of behavior you want to obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uB62pAWvGHA.4444 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an AS2005
cube. My suspicion was that they were caused by the original cube design
not specifying attribute relationships in the time dimension. I've read
the Project REAL stuff and the P&P BI Guidelines in an effort to fully
grok the implications of attribute relationships, natural hierarchies,
and aggregations, but there's still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day > Hour
Minute (where minute is actually a 15 minute time window). The dimension
table columns are the key, minute (0,15,30,45), hour of day (0-23), day
of month (0-days in month), month (1-12), year. Originally there were
attributes defined for each of these columns, with each attribute using
its corresponding column as its KeyColumn, and with all attributes
related directly to the key. A user hierarchy provided navigation from
the year down to the minute. Performance, in a word, sucked (drilling
down in the cube browser to a low level in the hierarchy could take as
long as 45 minutes!).

I changed the KeyColumn for each attribute below year to be a composite
of each higher level attribute (so the KeyColumns in Month were Year and
Month, for Day they were Year/Month/Day, etc). Then I set the attribute
relationships accordingly. I'm not sure if this is the most efficient way
to define the attributes, but it seemed to be the only way to do it with
our current dimension table schema.

Once I made this change and re-processed, drilling through the hierarchy
went from 45 minutes to less than a second. Not a bad improvement.
However, there's still some things that I'm unclear on.

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the aggregations
to be computed. But why would no aggregations perform so well?

* I'm assuming that by setting up these relationships, I'm giving up some
flexibility in my queries. Is that true? For example, what if I want to
compare how January compares to February, across all years. Are those
aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes in
different navigational hierarchies?

Thanks,
Kevin






Reply With Quote
  #4  
Old   
Akshai Mirchandani [MS]
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-14-2006 , 08:51 PM



More answers:

Quote:
Is there any more you can say here? This is the situation for me - after
making the user hierarchy into a natural hierachy, drilling down sped up
so dramatically it was amazing. But it all seems a bit like magic. If
disk-based aggregations aren't to thank, then what?
A hierarchy built on top of a natural hierarchy can materialize the children
of each parent. So a request to the storage engine that asks (for example)
for the children of a member can simply ask for an iterator over a
pre-materialized sub-tree of the children. But a hierarchy that is built on
an unnatural hierarchy has to send queries to the "raw" dimension data, do
ordering on top of the result and materialize the sub-tree "on the fly". So
unnatural hierarchies are much slower to iterate over for some common set
operations.

Quote:
System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular version
of a process.

However, for comparitive analysis, it's also useful to have the following
hierarchy as well:

System > Process > Activity Group > Activity > Activity Version
The combination above looks like this in terms of attribute relationships:

System (1,2)
Quote:
Process (1,2)

Version (1)

Activity Group (1,2)

Activity (1,2)

Activity Version (2)

A natural user hierarchy can skip attributes along the way of the
relationship chain. So hierarchy (1) and (2) can have levels as shown above
and still be natural.

HTH,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote

Quote:
Thanks Akshai! Great info.

So a query that happened to fetch data at the hours level could make a
later query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

In my case, I was navigating down the hierarchy from year to minute
(high-to-low-level), so I'm assuming your example doesn't apply here.

Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.

Is there any more you can say here? This is the situation for me - after
making the user hierarchy into a natural hierachy, drilling down sped up
so dramatically it was amazing. But it all seems a bit like magic. If
disk-based aggregations aren't to thank, then what?

2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

Looking around the AdvWorks cube, I noticed that pattern. I'll keep it in
mind. I just wanted to make sure I was understanding things correctly.

It would help to see an example of this problem.

OK, in this case we have a set of business processes, each of which are
made up activities. Processes can be versioned, so each activity has a
version associated with it. One user hierarchy looks something this:

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular version
of a process.

However, for comparitive analysis, it's also useful to have the following
hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

This lets you easily compare at two different versions of the same
activity.

So you see this is actually the same set of attributes, stacked in
different ways. I'm not sure how I should design the natural hierarchies
here (if at all).



"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in message
news:%23kF2q$9vGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Some answers:

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform so
well?

The benefits of good attribute relationships are in no way limited purely
to disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data can
also be used for aggregating higher levels of data. So a query that
happened to fetch data at the hours level could make a later query at the
days level really fast because the cache for the hour-level data can be
used to aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years. Are
those aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?

In a sense, yes you are losing some flexibility -- now February isn't an
entity by itself, but rather each instance of February is strongly
associated with a year. However, you can still obtain the results you
need with different solutions:
1. You can quite easily write a calculation to aggregate over the 1st
child of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model
somewhat more complex by adding more attributes.

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes
in different navigational hierarchies?

It would help to see an example of this problem.

Generally its okay if the attributes are combined together in different
ways -- but you should make sure that they follow the same rules and
semantics when creating the relationships. Make sure that the meaning of
each attribute is the same in each hierarchy and the functional
dependencies hold true and then it should turn out right.

In some cases, you may decide to actually split up attributes (like Month
and Month Of Year) because they really mean different things in the two
hierarchies. But that is a choice and design decision that you need to
make based on what kind of behavior you want to obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uB62pAWvGHA.4444 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an AS2005
cube. My suspicion was that they were caused by the original cube design
not specifying attribute relationships in the time dimension. I've read
the Project REAL stuff and the P&P BI Guidelines in an effort to fully
grok the implications of attribute relationships, natural hierarchies,
and aggregations, but there's still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day > Hour
Minute (where minute is actually a 15 minute time window). The dimension
table columns are the key, minute (0,15,30,45), hour of day (0-23), day
of month (0-days in month), month (1-12), year. Originally there were
attributes defined for each of these columns, with each attribute using
its corresponding column as its KeyColumn, and with all attributes
related directly to the key. A user hierarchy provided navigation from
the year down to the minute. Performance, in a word, sucked (drilling
down in the cube browser to a low level in the hierarchy could take as
long as 45 minutes!).

I changed the KeyColumn for each attribute below year to be a composite
of each higher level attribute (so the KeyColumns in Month were Year and
Month, for Day they were Year/Month/Day, etc). Then I set the attribute
relationships accordingly. I'm not sure if this is the most efficient
way to define the attributes, but it seemed to be the only way to do it
with our current dimension table schema.

Once I made this change and re-processed, drilling through the hierarchy
went from 45 minutes to less than a second. Not a bad improvement.
However, there's still some things that I'm unclear on.

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform so
well?

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years. Are
those aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes
in different navigational hierarchies?

Thanks,
Kevin








Reply With Quote
  #5  
Old   
Kevin Dente
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-15-2006 , 11:47 AM



Quote:
A hierarchy built on top of a natural hierarchy can materialize the
children of each parent....
Um..ok, I think I maybe kinda sorta get what you're saying. As an example,
when attribute relationships are in place and I'm drilling down into the
hour level of July 15, 2006, SSAS knows the 24 specific hours that belong to
that day, and only needs to compute the cells for that subset. But when
there are no relationships, it needs to scan through all hours of all days,
figure out which to include, and then do the calculations. I'm sure I'm not
getting that exactly right, but is that sort of the general idea?

Quote:
A natural user hierarchy can skip attributes along the way of the
relationship chain.
Apologies, Akshai, I was both imprecise and inaccurate in my original
description of our hiearchies. Let me try this one more time. They are
actually:

System > Process > Process Version > Activity Group > Activity

and

System > Process > Activity Group > Activity > Process Version

In other words, the exact same Process Version attribute is used, but in two
different places in the hiearchies. We're not "skipping" levels here.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote

Quote:
More answers:

Is there any more you can say here? This is the situation for me - after
making the user hierarchy into a natural hierachy, drilling down sped up
so dramatically it was amazing. But it all seems a bit like magic. If
disk-based aggregations aren't to thank, then what?

A hierarchy built on top of a natural hierarchy can materialize the
children of each parent. So a request to the storage engine that asks (for
example) for the children of a member can simply ask for an iterator over
a pre-materialized sub-tree of the children. But a hierarchy that is built
on an unnatural hierarchy has to send queries to the "raw" dimension data,
do ordering on top of the result and materialize the sub-tree "on the
fly". So unnatural hierarchies are much slower to iterate over for some
common set operations.

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the following
hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

The combination above looks like this in terms of attribute relationships:

System (1,2)
|
Process (1,2)
|
Version (1)
|
Activity Group (1,2)
|
Activity (1,2)
|
Activity Version (2)

A natural user hierarchy can skip attributes along the way of the
relationship chain. So hierarchy (1) and (2) can have levels as shown
above and still be natural.

HTH,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:evcPvPAwGHA.4756 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Thanks Akshai! Great info.

So a query that happened to fetch data at the hours level could make a
later query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

In my case, I was navigating down the hierarchy from year to minute
(high-to-low-level), so I'm assuming your example doesn't apply here.

Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.

Is there any more you can say here? This is the situation for me - after
making the user hierarchy into a natural hierachy, drilling down sped up
so dramatically it was amazing. But it all seems a bit like magic. If
disk-based aggregations aren't to thank, then what?

2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

Looking around the AdvWorks cube, I noticed that pattern. I'll keep it in
mind. I just wanted to make sure I was understanding things correctly.

It would help to see an example of this problem.

OK, in this case we have a set of business processes, each of which are
made up activities. Processes can be versioned, so each activity has a
version associated with it. One user hierarchy looks something this:

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the following
hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

This lets you easily compare at two different versions of the same
activity.

So you see this is actually the same set of attributes, stacked in
different ways. I'm not sure how I should design the natural hierarchies
here (if at all).



"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in message
news:%23kF2q$9vGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Some answers:

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform so
well?

The benefits of good attribute relationships are in no way limited
purely to disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data can
also be used for aggregating higher levels of data. So a query that
happened to fetch data at the hours level could make a later query at
the days level really fast because the cache for the hour-level data can
be used to aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years. Are
those aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?

In a sense, yes you are losing some flexibility -- now February isn't an
entity by itself, but rather each instance of February is strongly
associated with a year. However, you can still obtain the results you
need with different solutions:
1. You can quite easily write a calculation to aggregate over the 1st
child of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model
somewhat more complex by adding more attributes.

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes
in different navigational hierarchies?

It would help to see an example of this problem.

Generally its okay if the attributes are combined together in different
ways -- but you should make sure that they follow the same rules and
semantics when creating the relationships. Make sure that the meaning of
each attribute is the same in each hierarchy and the functional
dependencies hold true and then it should turn out right.

In some cases, you may decide to actually split up attributes (like
Month and Month Of Year) because they really mean different things in
the two hierarchies. But that is a choice and design decision that you
need to make based on what kind of behavior you want to obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uB62pAWvGHA.4444 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an
AS2005 cube. My suspicion was that they were caused by the original
cube design not specifying attribute relationships in the time
dimension. I've read the Project REAL stuff and the P&P BI Guidelines
in an effort to fully grok the implications of attribute relationships,
natural hierarchies, and aggregations, but there's still some stuff I'm
unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day > Hour
Minute (where minute is actually a 15 minute time window). The
dimension table columns are the key, minute (0,15,30,45), hour of day
(0-23), day of month (0-days in month), month (1-12), year. Originally
there were attributes defined for each of these columns, with each
attribute using its corresponding column as its KeyColumn, and with all
attributes related directly to the key. A user hierarchy provided
navigation from the year down to the minute. Performance, in a word,
sucked (drilling down in the cube browser to a low level in the
hierarchy could take as long as 45 minutes!).

I changed the KeyColumn for each attribute below year to be a composite
of each higher level attribute (so the KeyColumns in Month were Year
and Month, for Day they were Year/Month/Day, etc). Then I set the
attribute relationships accordingly. I'm not sure if this is the most
efficient way to define the attributes, but it seemed to be the only
way to do it with our current dimension table schema.

Once I made this change and re-processed, drilling through the
hierarchy went from 45 minutes to less than a second. Not a bad
improvement. However, there's still some things that I'm unclear on.

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform so
well?

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years. Are
those aggregations not longer being performed, because of the attribute
relationships? If so, what's the best way to deal with that? Or am I
misunderstanding something?

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the same
attributes in different ways. Does defining a natural hierarchy through
attribute relationships limit your ability to combine those attributes
in different navigational hierarchies?

Thanks,
Kevin










Reply With Quote
  #6  
Old   
Akshai Mirchandani [MS]
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-15-2006 , 02:15 PM



Quote:
figure out which to include, and then do the calculations. I'm sure I'm
not getting that exactly right, but is that sort of the general idea?
Essentially yes. (Although using the word calculations is a little
misleading -- its more like doing joins between attributes)

Quote:
In other words, the exact same Process Version attribute is used, but in
two different places in the hiearchies. We're not "skipping" levels here.
In that case you need to define the semantics of Process Version -- it
doesn't usually make sense to have a loop in functional dependencies (one
exception may be 1:1 relationships but attribute relationships don't deal
with that).

You can once again do the same thing as before -- create two attributes for
Process Version, one which is a related attribute of Activity Group and the
other which is related to Activity. Give the two attributes different
semantics though, because their uniqueness and behavior with respect to the
other attributes will be quite different. In one case Process Version
appears to be the key of the dimension, and in the other it is somewhere in
the middle of the hierarchy.

Alternatively, you can go with one of the hierarchies being natural and the
other being unnatural -- but it really depends on what you are trying to do
here...

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for newsgroup
purposes only.


"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote

Quote:
A hierarchy built on top of a natural hierarchy can materialize the
children of each parent....

Um..ok, I think I maybe kinda sorta get what you're saying. As an example,
when attribute relationships are in place and I'm drilling down into the
hour level of July 15, 2006, SSAS knows the 24 specific hours that belong
to that day, and only needs to compute the cells for that subset. But when
there are no relationships, it needs to scan through all hours of all
days, figure out which to include, and then do the calculations. I'm sure
I'm not getting that exactly right, but is that sort of the general idea?

A natural user hierarchy can skip attributes along the way of the
relationship chain.

Apologies, Akshai, I was both imprecise and inaccurate in my original
description of our hiearchies. Let me try this one more time. They are
actually:

System > Process > Process Version > Activity Group > Activity

and

System > Process > Activity Group > Activity > Process Version

In other words, the exact same Process Version attribute is used, but in
two different places in the hiearchies. We're not "skipping" levels here.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in message
news:OOEuN1AwGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
More answers:

Is there any more you can say here? This is the situation for me - after
making the user hierarchy into a natural hierachy, drilling down sped up
so dramatically it was amazing. But it all seems a bit like magic. If
disk-based aggregations aren't to thank, then what?

A hierarchy built on top of a natural hierarchy can materialize the
children of each parent. So a request to the storage engine that asks
(for example) for the children of a member can simply ask for an iterator
over a pre-materialized sub-tree of the children. But a hierarchy that is
built on an unnatural hierarchy has to send queries to the "raw"
dimension data, do ordering on top of the result and materialize the
sub-tree "on the fly". So unnatural hierarchies are much slower to
iterate over for some common set operations.

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

The combination above looks like this in terms of attribute
relationships:

System (1,2)
|
Process (1,2)
|
Version (1)
|
Activity Group (1,2)
|
Activity (1,2)
|
Activity Version (2)

A natural user hierarchy can skip attributes along the way of the
relationship chain. So hierarchy (1) and (2) can have levels as shown
above and still be natural.

HTH,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:evcPvPAwGHA.4756 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Thanks Akshai! Great info.

So a query that happened to fetch data at the hours level could make a
later query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

In my case, I was navigating down the hierarchy from year to minute
(high-to-low-level), so I'm assuming your example doesn't apply here.

Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.

Is there any more you can say here? This is the situation for me - after
making the user hierarchy into a natural hierachy, drilling down sped up
so dramatically it was amazing. But it all seems a bit like magic. If
disk-based aggregations aren't to thank, then what?

2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

Looking around the AdvWorks cube, I noticed that pattern. I'll keep it
in mind. I just wanted to make sure I was understanding things
correctly.

It would help to see an example of this problem.

OK, in this case we have a set of business processes, each of which are
made up activities. Processes can be versioned, so each activity has a
version associated with it. One user hierarchy looks something this:

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

This lets you easily compare at two different versions of the same
activity.

So you see this is actually the same set of attributes, stacked in
different ways. I'm not sure how I should design the natural hierarchies
here (if at all).



"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:%23kF2q$9vGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Some answers:

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform so
well?

The benefits of good attribute relationships are in no way limited
purely to disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data can
also be used for aggregating higher levels of data. So a query that
happened to fetch data at the hours level could make a later query at
the days level really fast because the cache for the hour-level data
can be used to aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if it is
natural -- and depending on the queries this can sometimes lead to huge
improvements in performance.

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years.
Are those aggregations not longer being performed, because of the
attribute relationships? If so, what's the best way to deal with that?
Or am I misunderstanding something?

In a sense, yes you are losing some flexibility -- now February isn't
an entity by itself, but rather each instance of February is strongly
associated with a year. However, you can still obtain the results you
need with different solutions:
1. You can quite easily write a calculation to aggregate over the 1st
child of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model
somewhat more complex by adding more attributes.

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the
same attributes in different ways. Does defining a natural hierarchy
through attribute relationships limit your ability to combine those
attributes in different navigational hierarchies?

It would help to see an example of this problem.

Generally its okay if the attributes are combined together in different
ways -- but you should make sure that they follow the same rules and
semantics when creating the relationships. Make sure that the meaning
of each attribute is the same in each hierarchy and the functional
dependencies hold true and then it should turn out right.

In some cases, you may decide to actually split up attributes (like
Month and Month Of Year) because they really mean different things in
the two hierarchies. But that is a choice and design decision that you
need to make based on what kind of behavior you want to obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uB62pAWvGHA.4444 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an
AS2005 cube. My suspicion was that they were caused by the original
cube design not specifying attribute relationships in the time
dimension. I've read the Project REAL stuff and the P&P BI Guidelines
in an effort to fully grok the implications of attribute
relationships, natural hierarchies, and aggregations, but there's
still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day > Hour
Minute (where minute is actually a 15 minute time window). The
dimension table columns are the key, minute (0,15,30,45), hour of day
(0-23), day of month (0-days in month), month (1-12), year. Originally
there were attributes defined for each of these columns, with each
attribute using its corresponding column as its KeyColumn, and with
all attributes related directly to the key. A user hierarchy provided
navigation from the year down to the minute. Performance, in a word,
sucked (drilling down in the cube browser to a low level in the
hierarchy could take as long as 45 minutes!).

I changed the KeyColumn for each attribute below year to be a
composite of each higher level attribute (so the KeyColumns in Month
were Year and Month, for Day they were Year/Month/Day, etc). Then I
set the attribute relationships accordingly. I'm not sure if this is
the most efficient way to define the attributes, but it seemed to be
the only way to do it with our current dimension table schema.

Once I made this change and re-processed, drilling through the
hierarchy went from 45 minutes to less than a second. Not a bad
improvement. However, there's still some things that I'm unclear
on.

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform so
well?

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years.
Are those aggregations not longer being performed, because of the
attribute relationships? If so, what's the best way to deal with that?
Or am I misunderstanding something?

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the
same attributes in different ways. Does defining a natural hierarchy
through attribute relationships limit your ability to combine those
attributes in different navigational hierarchies?

Thanks,
Kevin












Reply With Quote
  #7  
Old   
Kevin Dente
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-18-2006 , 07:46 PM



OK, thanks for the info, Akshai.

One follow up question on the topic. I only just realized something -
changing the dimension definitions to include attribute relationships ended
up changing the unique names of the dimension members. What was:

[Calendar].[Calendar].[Year].&[2006].&[8].&[18]

without relationships became

[Calendar].[Calendar].[Day].&[2006]&[8]&[18]

with them. We have some code that constructs dimension members for dynamic
queries, and that broke (started returning null values).

I haven't found much documentation about how those member names are created,
and I want to make sure I understand how we should be constructing them for
our queries. Anything you can point me to here?

One last comment about attribute relationships - this seems like such an
important area of dimension design, yet I feel like the BOL don't do it
justice. Fortunately the project REAL and P&P BI guide call it out as key. I
hope the coverage of this area in the BOL is expanded in the future. Or
perhaps a separate MSDN article or white paper - something that accentuates
how important they are, and what the implications of using them (and not
using them) are.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote

Quote:
figure out which to include, and then do the calculations. I'm sure I'm
not getting that exactly right, but is that sort of the general idea?

Essentially yes. (Although using the word calculations is a little
misleading -- its more like doing joins between attributes)

In other words, the exact same Process Version attribute is used, but in
two different places in the hiearchies. We're not "skipping" levels here.

In that case you need to define the semantics of Process Version -- it
doesn't usually make sense to have a loop in functional dependencies (one
exception may be 1:1 relationships but attribute relationships don't deal
with that).

You can once again do the same thing as before -- create two attributes
for Process Version, one which is a related attribute of Activity Group
and the other which is related to Activity. Give the two attributes
different semantics though, because their uniqueness and behavior with
respect to the other attributes will be quite different. In one case
Process Version appears to be the key of the dimension, and in the other
it is somewhere in the middle of the hierarchy.

Alternatively, you can go with one of the hierarchies being natural and
the other being unnatural -- but it really depends on what you are trying
to do here...

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.


"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uXBtNqIwGHA.2204 (AT) TK2MSFTNGP03 (DOT) phx.gbl...
A hierarchy built on top of a natural hierarchy can materialize the
children of each parent....

Um..ok, I think I maybe kinda sorta get what you're saying. As an
example, when attribute relationships are in place and I'm drilling down
into the hour level of July 15, 2006, SSAS knows the 24 specific hours
that belong to that day, and only needs to compute the cells for that
subset. But when there are no relationships, it needs to scan through all
hours of all days, figure out which to include, and then do the
calculations. I'm sure I'm not getting that exactly right, but is that
sort of the general idea?

A natural user hierarchy can skip attributes along the way of the
relationship chain.

Apologies, Akshai, I was both imprecise and inaccurate in my original
description of our hiearchies. Let me try this one more time. They are
actually:

System > Process > Process Version > Activity Group > Activity

and

System > Process > Activity Group > Activity > Process Version

In other words, the exact same Process Version attribute is used, but in
two different places in the hiearchies. We're not "skipping" levels here.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in message
news:OOEuN1AwGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
More answers:

Is there any more you can say here? This is the situation for me -
after making the user hierarchy into a natural hierachy, drilling down
sped up so dramatically it was amazing. But it all seems a bit like
magic. If disk-based aggregations aren't to thank, then what?

A hierarchy built on top of a natural hierarchy can materialize the
children of each parent. So a request to the storage engine that asks
(for example) for the children of a member can simply ask for an
iterator over a pre-materialized sub-tree of the children. But a
hierarchy that is built on an unnatural hierarchy has to send queries to
the "raw" dimension data, do ordering on top of the result and
materialize the sub-tree "on the fly". So unnatural hierarchies are much
slower to iterate over for some common set operations.

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

The combination above looks like this in terms of attribute
relationships:

System (1,2)
|
Process (1,2)
|
Version (1)
|
Activity Group (1,2)
|
Activity (1,2)
|
Activity Version (2)

A natural user hierarchy can skip attributes along the way of the
relationship chain. So hierarchy (1) and (2) can have levels as shown
above and still be natural.

HTH,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:evcPvPAwGHA.4756 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Thanks Akshai! Great info.

So a query that happened to fetch data at the hours level could make a
later query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

In my case, I was navigating down the hierarchy from year to minute
(high-to-low-level), so I'm assuming your example doesn't apply here.

Additionally, access to the user hierarchy becomes much faster if it
is natural -- and depending on the queries this can sometimes lead to
huge improvements in performance.

Is there any more you can say here? This is the situation for me -
after making the user hierarchy into a natural hierachy, drilling down
sped up so dramatically it was amazing. But it all seems a bit like
magic. If disk-based aggregations aren't to thank, then what?

2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

Looking around the AdvWorks cube, I noticed that pattern. I'll keep it
in mind. I just wanted to make sure I was understanding things
correctly.

It would help to see an example of this problem.

OK, in this case we have a set of business processes, each of which are
made up activities. Processes can be versioned, so each activity has a
version associated with it. One user hierarchy looks something this:

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

This lets you easily compare at two different versions of the same
activity.

So you see this is actually the same set of attributes, stacked in
different ways. I'm not sure how I should design the natural
hierarchies here (if at all).



"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:%23kF2q$9vGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Some answers:

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform so
well?

The benefits of good attribute relationships are in no way limited
purely to disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data
can also be used for aggregating higher levels of data. So a query
that happened to fetch data at the hours level could make a later
query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if it
is natural -- and depending on the queries this can sometimes lead to
huge improvements in performance.

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years.
Are those aggregations not longer being performed, because of the
attribute relationships? If so, what's the best way to deal with
that? Or am I misunderstanding something?

In a sense, yes you are losing some flexibility -- now February isn't
an entity by itself, but rather each instance of February is strongly
associated with a year. However, you can still obtain the results you
need with different solutions:
1. You can quite easily write a calculation to aggregate over the 1st
child of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model
somewhat more complex by adding more attributes.

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the
same attributes in different ways. Does defining a natural hierarchy
through attribute relationships limit your ability to combine those
attributes in different navigational hierarchies?

It would help to see an example of this problem.

Generally its okay if the attributes are combined together in
different ways -- but you should make sure that they follow the same
rules and semantics when creating the relationships. Make sure that
the meaning of each attribute is the same in each hierarchy and the
functional dependencies hold true and then it should turn out right.

In some cases, you may decide to actually split up attributes (like
Month and Month Of Year) because they really mean different things in
the two hierarchies. But that is a choice and design decision that you
need to make based on what kind of behavior you want to obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uB62pAWvGHA.4444 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an
AS2005 cube. My suspicion was that they were caused by the original
cube design not specifying attribute relationships in the time
dimension. I've read the Project REAL stuff and the P&P BI Guidelines
in an effort to fully grok the implications of attribute
relationships, natural hierarchies, and aggregations, but there's
still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day > Hour
Minute (where minute is actually a 15 minute time window). The
dimension table columns are the key, minute (0,15,30,45), hour of day
(0-23), day of month (0-days in month), month (1-12), year.
Originally there were attributes defined for each of these columns,
with each attribute using its corresponding column as its KeyColumn,
and with all attributes related directly to the key. A user hierarchy
provided navigation from the year down to the minute. Performance, in
a word, sucked (drilling down in the cube browser to a low level in
the hierarchy could take as long as 45 minutes!).

I changed the KeyColumn for each attribute below year to be a
composite of each higher level attribute (so the KeyColumns in Month
were Year and Month, for Day they were Year/Month/Day, etc). Then I
set the attribute relationships accordingly. I'm not sure if this is
the most efficient way to define the attributes, but it seemed to be
the only way to do it with our current dimension table schema.

Once I made this change and re-processed, drilling through the
hierarchy went from 45 minutes to less than a second. Not a bad
improvement. However, there's still some things that I'm unclear
on.

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform so
well?

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years.
Are those aggregations not longer being performed, because of the
attribute relationships? If so, what's the best way to deal with
that? Or am I misunderstanding something?

* I'm looking at making similar changes to a different dimension, but
that dimension has two different user hierarchies that combine the
same attributes in different ways. Does defining a natural hierarchy
through attribute relationships limit your ability to combine those
attributes in different navigational hierarchies?

Thanks,
Kevin














Reply With Quote
  #8  
Old   
Akshai Mirchandani [MS]
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-21-2006 , 02:00 PM



Quote:
I haven't found much documentation about how those member names are
created, and I want to make sure I understand how we should be
constructing them for our queries. Anything you can point me to here?
Yes, the unique names do depend on the uniqueness of attributes in the
hierarchy -- so defining attribute relationships will change this.

It is strongly recommended to avoid constructing unique names yourself.
There are certain applications that do so but they have to be very careful
about doing this and I would definitely suggest that you look at other
solutions -- like using the MDSCHEMA_MEMBERS schema rowset to discover the
uniquenames by restricting on their caption/name instead.

There is a very little documentation on this in Books Online here on the
main way you can tune this:
http://msdn2.microsoft.com/en-us/library/ms174900.aspx

MemberUniqueNameStyle
Determines how unique names are generated for members of hierarchies
contained within the cube dimension. This property can have the following
values:

Value Description
Native
Analysis Services automatically determines the unique names of
members. This is the default value.

NamePath
Analysis Services generates a compound name consisting of the
name of each level and the caption of the member.




Basically, you can switch between letting the server decide versus forcing
the server to generate a name-per-level unique name.

But once again, I highly recommend not trying to do this yourself.

Quote:
justice. Fortunately the project REAL and P&P BI guide call it out as key.
I hope the coverage of this area in the BOL is expanded in the future. Or
perhaps a separate MSDN article or white paper - something that
accentuates how important they are, and what the implications of using
them (and not using them) are.
A common complaint and one that is being addressed. It will take a little
time for the documentation to become more easily available but it is
absolutely in the works -- mainly as part of a whitepaper corresponding to
the AS 2000 Performance Guide.

In the meantime, here are a couple of other resources:

http://msdn2.microsoft.com/en-us/library/ms174557.aspx
http://www.sqlserveranalysisservices.com/default.htm

The first is a general (simple) description of attribute relationships. The
second discusses how attribute relationships affects calculations -- this is
a very interesting and subtle area which you may not care about if you are
in more of a data warehouse scenario than a calculation scenario. But it is
a good area to understand.

Additionally the Project REAL docs and various blogs and postings out there
have discussions about this topic.

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote

Quote:
OK, thanks for the info, Akshai.

One follow up question on the topic. I only just realized something -
changing the dimension definitions to include attribute relationships
ended up changing the unique names of the dimension members. What was:

[Calendar].[Calendar].[Year].&[2006].&[8].&[18]

without relationships became

[Calendar].[Calendar].[Day].&[2006]&[8]&[18]

with them. We have some code that constructs dimension members for dynamic
queries, and that broke (started returning null values).

I haven't found much documentation about how those member names are
created, and I want to make sure I understand how we should be
constructing them for our queries. Anything you can point me to here?

One last comment about attribute relationships - this seems like such an
important area of dimension design, yet I feel like the BOL don't do it
justice. Fortunately the project REAL and P&P BI guide call it out as key.
I hope the coverage of this area in the BOL is expanded in the future. Or
perhaps a separate MSDN article or white paper - something that
accentuates how important they are, and what the implications of using
them (and not using them) are.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in message
news:%23SLe78JwGHA.4460 (AT) TK2MSFTNGP04 (DOT) phx.gbl...
figure out which to include, and then do the calculations. I'm sure I'm
not getting that exactly right, but is that sort of the general idea?

Essentially yes. (Although using the word calculations is a little
misleading -- its more like doing joins between attributes)

In other words, the exact same Process Version attribute is used, but in
two different places in the hiearchies. We're not "skipping" levels
here.

In that case you need to define the semantics of Process Version -- it
doesn't usually make sense to have a loop in functional dependencies (one
exception may be 1:1 relationships but attribute relationships don't deal
with that).

You can once again do the same thing as before -- create two attributes
for Process Version, one which is a related attribute of Activity Group
and the other which is related to Activity. Give the two attributes
different semantics though, because their uniqueness and behavior with
respect to the other attributes will be quite different. In one case
Process Version appears to be the key of the dimension, and in the other
it is somewhere in the middle of the hierarchy.

Alternatively, you can go with one of the hierarchies being natural and
the other being unnatural -- but it really depends on what you are trying
to do here...

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.


"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uXBtNqIwGHA.2204 (AT) TK2MSFTNGP03 (DOT) phx.gbl...
A hierarchy built on top of a natural hierarchy can materialize the
children of each parent....

Um..ok, I think I maybe kinda sorta get what you're saying. As an
example, when attribute relationships are in place and I'm drilling down
into the hour level of July 15, 2006, SSAS knows the 24 specific hours
that belong to that day, and only needs to compute the cells for that
subset. But when there are no relationships, it needs to scan through
all hours of all days, figure out which to include, and then do the
calculations. I'm sure I'm not getting that exactly right, but is that
sort of the general idea?

A natural user hierarchy can skip attributes along the way of the
relationship chain.

Apologies, Akshai, I was both imprecise and inaccurate in my original
description of our hiearchies. Let me try this one more time. They are
actually:

System > Process > Process Version > Activity Group > Activity

and

System > Process > Activity Group > Activity > Process Version

In other words, the exact same Process Version attribute is used, but in
two different places in the hiearchies. We're not "skipping" levels
here.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:OOEuN1AwGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
More answers:

Is there any more you can say here? This is the situation for me -
after making the user hierarchy into a natural hierachy, drilling down
sped up so dramatically it was amazing. But it all seems a bit like
magic. If disk-based aggregations aren't to thank, then what?

A hierarchy built on top of a natural hierarchy can materialize the
children of each parent. So a request to the storage engine that asks
(for example) for the children of a member can simply ask for an
iterator over a pre-materialized sub-tree of the children. But a
hierarchy that is built on an unnatural hierarchy has to send queries
to the "raw" dimension data, do ordering on top of the result and
materialize the sub-tree "on the fly". So unnatural hierarchies are
much slower to iterate over for some common set operations.

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

The combination above looks like this in terms of attribute
relationships:

System (1,2)
|
Process (1,2)
|
Version (1)
|
Activity Group (1,2)
|
Activity (1,2)
|
Activity Version (2)

A natural user hierarchy can skip attributes along the way of the
relationship chain. So hierarchy (1) and (2) can have levels as shown
above and still be natural.

HTH,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:evcPvPAwGHA.4756 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Thanks Akshai! Great info.

So a query that happened to fetch data at the hours level could make a
later query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

In my case, I was navigating down the hierarchy from year to minute
(high-to-low-level), so I'm assuming your example doesn't apply here.

Additionally, access to the user hierarchy becomes much faster if it
is natural -- and depending on the queries this can sometimes lead to
huge improvements in performance.

Is there any more you can say here? This is the situation for me -
after making the user hierarchy into a natural hierachy, drilling down
sped up so dramatically it was amazing. But it all seems a bit like
magic. If disk-based aggregations aren't to thank, then what?

2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

Looking around the AdvWorks cube, I noticed that pattern. I'll keep it
in mind. I just wanted to make sure I was understanding things
correctly.

It would help to see an example of this problem.

OK, in this case we have a set of business processes, each of which
are made up activities. Processes can be versioned, so each activity
has a version associated with it. One user hierarchy looks something
this:

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

This lets you easily compare at two different versions of the same
activity.

So you see this is actually the same set of attributes, stacked in
different ways. I'm not sure how I should design the natural
hierarchies here (if at all).



"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:%23kF2q$9vGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Some answers:

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform
so well?

The benefits of good attribute relationships are in no way limited
purely to disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data
can also be used for aggregating higher levels of data. So a query
that happened to fetch data at the hours level could make a later
query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if it
is natural -- and depending on the queries this can sometimes lead to
huge improvements in performance.

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years.
Are those aggregations not longer being performed, because of the
attribute relationships? If so, what's the best way to deal with
that? Or am I misunderstanding something?

In a sense, yes you are losing some flexibility -- now February isn't
an entity by itself, but rather each instance of February is strongly
associated with a year. However, you can still obtain the results you
need with different solutions:
1. You can quite easily write a calculation to aggregate over the 1st
child of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related to
Month (or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model
somewhat more complex by adding more attributes.

* I'm looking at making similar changes to a different dimension,
but that dimension has two different user hierarchies that combine
the same attributes in different ways. Does defining a natural
hierarchy through attribute relationships limit your ability to
combine those attributes in different navigational hierarchies?

It would help to see an example of this problem.

Generally its okay if the attributes are combined together in
different ways -- but you should make sure that they follow the same
rules and semantics when creating the relationships. Make sure that
the meaning of each attribute is the same in each hierarchy and the
functional dependencies hold true and then it should turn out right.

In some cases, you may decide to actually split up attributes (like
Month and Month Of Year) because they really mean different things in
the two hierarchies. But that is a choice and design decision that
you need to make based on what kind of behavior you want to obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uB62pAWvGHA.4444 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an
AS2005 cube. My suspicion was that they were caused by the original
cube design not specifying attribute relationships in the time
dimension. I've read the Project REAL stuff and the P&P BI
Guidelines in an effort to fully grok the implications of attribute
relationships, natural hierarchies, and aggregations, but there's
still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day
Hour > Minute (where minute is actually a 15 minute time window).
The dimension table columns are the key, minute (0,15,30,45), hour
of day (0-23), day of month (0-days in month), month (1-12), year.
Originally there were attributes defined for each of these columns,
with each attribute using its corresponding column as its KeyColumn,
and with all attributes related directly to the key. A user
hierarchy provided navigation from the year down to the minute.
Performance, in a word, sucked (drilling down in the cube browser to
a low level in the hierarchy could take as long as 45 minutes!).

I changed the KeyColumn for each attribute below year to be a
composite of each higher level attribute (so the KeyColumns in Month
were Year and Month, for Day they were Year/Month/Day, etc). Then I
set the attribute relationships accordingly. I'm not sure if this is
the most efficient way to define the attributes, but it seemed to be
the only way to do it with our current dimension table schema.

Once I made this change and re-processed, drilling through the
hierarchy went from 45 minutes to less than a second. Not a bad
improvement. However, there's still some things that I'm unclear
on.

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform
so well?

* I'm assuming that by setting up these relationships, I'm giving up
some flexibility in my queries. Is that true? For example, what if I
want to compare how January compares to February, across all years.
Are those aggregations not longer being performed, because of the
attribute relationships? If so, what's the best way to deal with
that? Or am I misunderstanding something?

* I'm looking at making similar changes to a different dimension,
but that dimension has two different user hierarchies that combine
the same attributes in different ways. Does defining a natural
hierarchy through attribute relationships limit your ability to
combine those attributes in different navigational hierarchies?

Thanks,
Kevin
















Reply With Quote
  #9  
Old   
Kevin Dente
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-21-2006 , 03:13 PM



Quote:
It is strongly recommended to avoid constructing unique names yourself.
I didn't realize it was a no-no.

Quote:
like using the MDSCHEMA_MEMBERS schema rowset to discover the uniquenames
by restricting on their caption/name instead.
Hmm, I'm not familiar with that technique. If I want to find a unique name
for a specific member deep in the hierarchy (at the hour level, say - a
specific hour of a specific level of a specific year), do I need to do an
MDSCHEMA_MEMBERS query for each level down, or is there a way to do it all
in one shot?

Quote:
MemberUniqueNameStyle
I played around with this option, but it doesn't really help in our case.
After setting it to NamePath, the resulting member unique name is
[Calendar].[Calendar].[All].[2005].[July].[4].[0].[15]

which is still different than that our old format, so either way we have to
change this. Is there any performance implication to which name style we
choose? Or any other reason to choose one or the other?

Quote:
mainly as part of a whitepaper corresponding to the AS 2000 Performance
Guide.
I look forward to it with great anticipation.

Quote:
The second discusses how attribute relationships affects calculations
I'll check it out. One comment - it seems like almost all the discussions of
attribute relationships "out there" use the same, relatively simple
example - geography. It would be nice to see some discussion of other types
of dimensions where relationships are important, and what the ramifications
of them are (like with my calendar questions earlier).


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote

Quote:
I haven't found much documentation about how those member names are
created, and I want to make sure I understand how we should be
constructing them for our queries. Anything you can point me to here?

Yes, the unique names do depend on the uniqueness of attributes in the
hierarchy -- so defining attribute relationships will change this.

It is strongly recommended to avoid constructing unique names yourself.
There are certain applications that do so but they have to be very careful
about doing this and I would definitely suggest that you look at other
solutions -- like using the MDSCHEMA_MEMBERS schema rowset to discover the
uniquenames by restricting on their caption/name instead.

There is a very little documentation on this in Books Online here on the
main way you can tune this:
http://msdn2.microsoft.com/en-us/library/ms174900.aspx

MemberUniqueNameStyle
Determines how unique names are generated for members of hierarchies
contained within the cube dimension. This property can have the following
values:

Value Description
Native
Analysis Services automatically determines the unique names of
members. This is the default value.

NamePath
Analysis Services generates a compound name consisting of the
name of each level and the caption of the member.




Basically, you can switch between letting the server decide versus forcing
the server to generate a name-per-level unique name.

But once again, I highly recommend not trying to do this yourself.

justice. Fortunately the project REAL and P&P BI guide call it out as
key. I hope the coverage of this area in the BOL is expanded in the
future. Or perhaps a separate MSDN article or white paper - something
that accentuates how important they are, and what the implications of
using them (and not using them) are.

A common complaint and one that is being addressed. It will take a little
time for the documentation to become more easily available but it is
absolutely in the works -- mainly as part of a whitepaper corresponding to
the AS 2000 Performance Guide.

In the meantime, here are a couple of other resources:

http://msdn2.microsoft.com/en-us/library/ms174557.aspx
http://www.sqlserveranalysisservices.com/default.htm

The first is a general (simple) description of attribute relationships.
The second discusses how attribute relationships affects calculations --
this is a very interesting and subtle area which you may not care about if
you are in more of a data warehouse scenario than a calculation scenario.
But it is a good area to understand.

Additionally the Project REAL docs and various blogs and postings out
there have discussions about this topic.

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:eL$BbjywGHA.4872 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
OK, thanks for the info, Akshai.

One follow up question on the topic. I only just realized something -
changing the dimension definitions to include attribute relationships
ended up changing the unique names of the dimension members. What was:

[Calendar].[Calendar].[Year].&[2006].&[8].&[18]

without relationships became

[Calendar].[Calendar].[Day].&[2006]&[8]&[18]

with them. We have some code that constructs dimension members for
dynamic queries, and that broke (started returning null values).

I haven't found much documentation about how those member names are
created, and I want to make sure I understand how we should be
constructing them for our queries. Anything you can point me to here?

One last comment about attribute relationships - this seems like such an
important area of dimension design, yet I feel like the BOL don't do it
justice. Fortunately the project REAL and P&P BI guide call it out as
key. I hope the coverage of this area in the BOL is expanded in the
future. Or perhaps a separate MSDN article or white paper - something
that accentuates how important they are, and what the implications of
using them (and not using them) are.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in message
news:%23SLe78JwGHA.4460 (AT) TK2MSFTNGP04 (DOT) phx.gbl...
figure out which to include, and then do the calculations. I'm sure I'm
not getting that exactly right, but is that sort of the general idea?

Essentially yes. (Although using the word calculations is a little
misleading -- its more like doing joins between attributes)

In other words, the exact same Process Version attribute is used, but
in two different places in the hiearchies. We're not "skipping" levels
here.

In that case you need to define the semantics of Process Version -- it
doesn't usually make sense to have a loop in functional dependencies
(one exception may be 1:1 relationships but attribute relationships
don't deal with that).

You can once again do the same thing as before -- create two attributes
for Process Version, one which is a related attribute of Activity Group
and the other which is related to Activity. Give the two attributes
different semantics though, because their uniqueness and behavior with
respect to the other attributes will be quite different. In one case
Process Version appears to be the key of the dimension, and in the other
it is somewhere in the middle of the hierarchy.

Alternatively, you can go with one of the hierarchies being natural and
the other being unnatural -- but it really depends on what you are
trying to do here...

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.


"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uXBtNqIwGHA.2204 (AT) TK2MSFTNGP03 (DOT) phx.gbl...
A hierarchy built on top of a natural hierarchy can materialize the
children of each parent....

Um..ok, I think I maybe kinda sorta get what you're saying. As an
example, when attribute relationships are in place and I'm drilling
down into the hour level of July 15, 2006, SSAS knows the 24 specific
hours that belong to that day, and only needs to compute the cells for
that subset. But when there are no relationships, it needs to scan
through all hours of all days, figure out which to include, and then do
the calculations. I'm sure I'm not getting that exactly right, but is
that sort of the general idea?

A natural user hierarchy can skip attributes along the way of the
relationship chain.

Apologies, Akshai, I was both imprecise and inaccurate in my original
description of our hiearchies. Let me try this one more time. They are
actually:

System > Process > Process Version > Activity Group > Activity

and

System > Process > Activity Group > Activity > Process Version

In other words, the exact same Process Version attribute is used, but
in two different places in the hiearchies. We're not "skipping" levels
here.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:OOEuN1AwGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
More answers:

Is there any more you can say here? This is the situation for me -
after making the user hierarchy into a natural hierachy, drilling
down sped up so dramatically it was amazing. But it all seems a bit
like magic. If disk-based aggregations aren't to thank, then what?

A hierarchy built on top of a natural hierarchy can materialize the
children of each parent. So a request to the storage engine that asks
(for example) for the children of a member can simply ask for an
iterator over a pre-materialized sub-tree of the children. But a
hierarchy that is built on an unnatural hierarchy has to send queries
to the "raw" dimension data, do ordering on top of the result and
materialize the sub-tree "on the fly". So unnatural hierarchies are
much slower to iterate over for some common set operations.

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

The combination above looks like this in terms of attribute
relationships:

System (1,2)
|
Process (1,2)
|
Version (1)
|
Activity Group (1,2)
|
Activity (1,2)
|
Activity Version (2)

A natural user hierarchy can skip attributes along the way of the
relationship chain. So hierarchy (1) and (2) can have levels as shown
above and still be natural.

HTH,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:evcPvPAwGHA.4756 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Thanks Akshai! Great info.

So a query that happened to fetch data at the hours level could make
a later query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

In my case, I was navigating down the hierarchy from year to minute
(high-to-low-level), so I'm assuming your example doesn't apply here.

Additionally, access to the user hierarchy becomes much faster if it
is natural -- and depending on the queries this can sometimes lead
to huge improvements in performance.

Is there any more you can say here? This is the situation for me -
after making the user hierarchy into a natural hierachy, drilling
down sped up so dramatically it was amazing. But it all seems a bit
like magic. If disk-based aggregations aren't to thank, then what?

2. You can have another attribute (Month of Year) which is related
to Month (or to Day) and query by that attribute hierarchy instead.

Looking around the AdvWorks cube, I noticed that pattern. I'll keep
it in mind. I just wanted to make sure I was understanding things
correctly.

It would help to see an example of this problem.

OK, in this case we have a set of business processes, each of which
are made up activities. Processes can be versioned, so each activity
has a version associated with it. One user hierarchy looks something
this:

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

This lets you easily compare at two different versions of the same
activity.

So you see this is actually the same set of attributes, stacked in
different ways. I'm not sure how I should design the natural
hierarchies here (if at all).



"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:%23kF2q$9vGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Some answers:

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform
so well?

The benefits of good attribute relationships are in no way limited
purely to disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data
can also be used for aggregating higher levels of data. So a query
that happened to fetch data at the hours level could make a later
query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if it
is natural -- and depending on the queries this can sometimes lead
to huge improvements in performance.

* I'm assuming that by setting up these relationships, I'm giving
up some flexibility in my queries. Is that true? For example, what
if I want to compare how January compares to February, across all
years. Are those aggregations not longer being performed, because
of the attribute relationships? If so, what's the best way to deal
with that? Or am I misunderstanding something?

In a sense, yes you are losing some flexibility -- now February
isn't an entity by itself, but rather each instance of February is
strongly associated with a year. However, you can still obtain the
results you need with different solutions:
1. You can quite easily write a calculation to aggregate over the
1st child of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related
to Month (or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model
somewhat more complex by adding more attributes.

* I'm looking at making similar changes to a different dimension,
but that dimension has two different user hierarchies that combine
the same attributes in different ways. Does defining a natural
hierarchy through attribute relationships limit your ability to
combine those attributes in different navigational hierarchies?

It would help to see an example of this problem.

Generally its okay if the attributes are combined together in
different ways -- but you should make sure that they follow the same
rules and semantics when creating the relationships. Make sure that
the meaning of each attribute is the same in each hierarchy and the
functional dependencies hold true and then it should turn out right.

In some cases, you may decide to actually split up attributes (like
Month and Month Of Year) because they really mean different things
in the two hierarchies. But that is a choice and design decision
that you need to make based on what kind of behavior you want to
obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uB62pAWvGHA.4444 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an
AS2005 cube. My suspicion was that they were caused by the original
cube design not specifying attribute relationships in the time
dimension. I've read the Project REAL stuff and the P&P BI
Guidelines in an effort to fully grok the implications of attribute
relationships, natural hierarchies, and aggregations, but there's
still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day
Hour > Minute (where minute is actually a 15 minute time window).
The dimension table columns are the key, minute (0,15,30,45), hour
of day (0-23), day of month (0-days in month), month (1-12), year.
Originally there were attributes defined for each of these columns,
with each attribute using its corresponding column as its
KeyColumn, and with all attributes related directly to the key. A
user hierarchy provided navigation from the year down to the
minute. Performance, in a word, sucked (drilling down in the cube
browser to a low level in the hierarchy could take as long as 45
minutes!).

I changed the KeyColumn for each attribute below year to be a
composite of each higher level attribute (so the KeyColumns in
Month were Year and Month, for Day they were Year/Month/Day, etc).
Then I set the attribute relationships accordingly. I'm not sure if
this is the most efficient way to define the attributes, but it
seemed to be the only way to do it with our current dimension table
schema.

Once I made this change and re-processed, drilling through the
hierarchy went from 45 minutes to less than a second. Not a bad
improvement. However, there's still some things that I'm unclear
on.

* The performance seems to be about the same (that is, pretty fast)
whether the aggregations are set to 0% or 70%. That confuses me - I
thought that setting up the relationships is what allows the
aggregations to be computed. But why would no aggregations perform
so well?

* I'm assuming that by setting up these relationships, I'm giving
up some flexibility in my queries. Is that true? For example, what
if I want to compare how January compares to February, across all
years. Are those aggregations not longer being performed, because
of the attribute relationships? If so, what's the best way to deal
with that? Or am I misunderstanding something?

* I'm looking at making similar changes to a different dimension,
but that dimension has two different user hierarchies that combine
the same attributes in different ways. Does defining a natural
hierarchy through attribute relationships limit your ability to
combine those attributes in different navigational hierarchies?

Thanks,
Kevin


















Reply With Quote
  #10  
Old   
Akshai Mirchandani [MS]
 
Posts: n/a

Default Re: Understanding attribute relationships and aggregations in SSAS2005 - 08-21-2006 , 08:43 PM



Yes, creating unique names yourself has always been discouraged for this
reason (and others). In fact, I believe it is even in the OLE DB for OLAP
spec.

But lets be clear about this: unique names are going to be permanent as long
as you maintain the reliability of the model -- if the model changes (like
attribute relationships, or levels in the hierarchy), or you change the
names/keys of the members, then you may find that unique names are affected.

So the discouragement is simply that you avoid creating uniquenames yourself
in case your application doesn't detect such changes. But once your model is
"stable", then you are safe to save uniquenames that were constructed by the
server for you.

Quote:
Hmm, I'm not familiar with that technique. If I want to find a unique name
for a specific member deep in the hierarchy (at the hour level, say - a
specific hour of a specific level of a specific year), do I need to do an
MDSCHEMA_MEMBERS query for each level down, or is there a way to do it all
in one shot?
There are a few ways you can do this. The simplest is, as you mentioned
below, to send multiple discover requests -- one for each level.

If you actually need many descendants at a level (e.g. All hours of a
specific year) then you can use the TREE_OP restriction in combination with
the MEMBER_CAPTION restriction to obtain the descendants of a year at the
Hour level.

Unfortunately, you can't get exactly what you want in a single discover
request -- unless you actually fetch all the members at that level and above
and traverse the tree of descendant members yourself in your client
application.

Alternatively, you could also use MDX to do this. In fact, you should be
able to construct a qualified name like this:
SELECT
{} ON 0,
[TimeHierarchy].[Year1].[Month1].[Date1].[Hour1] ON 1
FROM [cube]
and this should give you the unique name of the member. Qualified names work
fine -- but rely on the structure of the hierarchy (and the member names)
not changing.

You can also do a Filter( member.Children, Caption = "a" ) on each level
although this could be a little expensive...

Quote:
I'll check it out. One comment - it seems like almost all the discussions
of attribute relationships "out there" use the same, relatively simple
example - geography. It would be nice to see some discussion of other
types of dimensions where relationships are important, and what the
ramifications of them are (like with my calendar questions earlier).
True, and I'll try to pass that on to the folks working on this.

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no rights
Please do not send email directly to this alias. This alias is for newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote

Quote:
It is strongly recommended to avoid constructing unique names yourself.

I didn't realize it was a no-no.

like using the MDSCHEMA_MEMBERS schema rowset to discover the uniquenames
by restricting on their caption/name instead.

Hmm, I'm not familiar with that technique. If I want to find a unique name
for a specific member deep in the hierarchy (at the hour level, say - a
specific hour of a specific level of a specific year), do I need to do an
MDSCHEMA_MEMBERS query for each level down, or is there a way to do it all
in one shot?

MemberUniqueNameStyle

I played around with this option, but it doesn't really help in our case.
After setting it to NamePath, the resulting member unique name is
[Calendar].[Calendar].[All].[2005].[July].[4].[0].[15]

which is still different than that our old format, so either way we have
to change this. Is there any performance implication to which name style
we choose? Or any other reason to choose one or the other?

mainly as part of a whitepaper corresponding to the AS 2000 Performance
Guide.

I look forward to it with great anticipation.

The second discusses how attribute relationships affects calculations

I'll check it out. One comment - it seems like almost all the discussions
of attribute relationships "out there" use the same, relatively simple
example - geography. It would be nice to see some discussion of other
types of dimensions where relationships are important, and what the
ramifications of them are (like with my calendar questions earlier).


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in message
news:O%2349HQVxGHA.4960 (AT) TK2MSFTNGP04 (DOT) phx.gbl...
I haven't found much documentation about how those member names are
created, and I want to make sure I understand how we should be
constructing them for our queries. Anything you can point me to here?

Yes, the unique names do depend on the uniqueness of attributes in the
hierarchy -- so defining attribute relationships will change this.

It is strongly recommended to avoid constructing unique names yourself.
There are certain applications that do so but they have to be very
careful about doing this and I would definitely suggest that you look at
other solutions -- like using the MDSCHEMA_MEMBERS schema rowset to
discover the uniquenames by restricting on their caption/name instead.

There is a very little documentation on this in Books Online here on the
main way you can tune this:
http://msdn2.microsoft.com/en-us/library/ms174900.aspx

MemberUniqueNameStyle
Determines how unique names are generated for members of hierarchies
contained within the cube dimension. This property can have the following
values:

Value Description
Native
Analysis Services automatically determines the unique names of
members. This is the default value.

NamePath
Analysis Services generates a compound name consisting of the
name of each level and the caption of the member.




Basically, you can switch between letting the server decide versus
forcing the server to generate a name-per-level unique name.

But once again, I highly recommend not trying to do this yourself.

justice. Fortunately the project REAL and P&P BI guide call it out as
key. I hope the coverage of this area in the BOL is expanded in the
future. Or perhaps a separate MSDN article or white paper - something
that accentuates how important they are, and what the implications of
using them (and not using them) are.

A common complaint and one that is being addressed. It will take a little
time for the documentation to become more easily available but it is
absolutely in the works -- mainly as part of a whitepaper corresponding
to the AS 2000 Performance Guide.

In the meantime, here are a couple of other resources:

http://msdn2.microsoft.com/en-us/library/ms174557.aspx
http://www.sqlserveranalysisservices.com/default.htm

The first is a general (simple) description of attribute relationships.
The second discusses how attribute relationships affects calculations --
this is a very interesting and subtle area which you may not care about
if you are in more of a data warehouse scenario than a calculation
scenario. But it is a good area to understand.

Additionally the Project REAL docs and various blogs and postings out
there have discussions about this topic.

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:eL$BbjywGHA.4872 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
OK, thanks for the info, Akshai.

One follow up question on the topic. I only just realized something -
changing the dimension definitions to include attribute relationships
ended up changing the unique names of the dimension members. What was:

[Calendar].[Calendar].[Year].&[2006].&[8].&[18]

without relationships became

[Calendar].[Calendar].[Day].&[2006]&[8]&[18]

with them. We have some code that constructs dimension members for
dynamic queries, and that broke (started returning null values).

I haven't found much documentation about how those member names are
created, and I want to make sure I understand how we should be
constructing them for our queries. Anything you can point me to here?

One last comment about attribute relationships - this seems like such an
important area of dimension design, yet I feel like the BOL don't do it
justice. Fortunately the project REAL and P&P BI guide call it out as
key. I hope the coverage of this area in the BOL is expanded in the
future. Or perhaps a separate MSDN article or white paper - something
that accentuates how important they are, and what the implications of
using them (and not using them) are.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:%23SLe78JwGHA.4460 (AT) TK2MSFTNGP04 (DOT) phx.gbl...
figure out which to include, and then do the calculations. I'm sure
I'm not getting that exactly right, but is that sort of the general
idea?

Essentially yes. (Although using the word calculations is a little
misleading -- its more like doing joins between attributes)

In other words, the exact same Process Version attribute is used, but
in two different places in the hiearchies. We're not "skipping" levels
here.

In that case you need to define the semantics of Process Version -- it
doesn't usually make sense to have a loop in functional dependencies
(one exception may be 1:1 relationships but attribute relationships
don't deal with that).

You can once again do the same thing as before -- create two attributes
for Process Version, one which is a related attribute of Activity Group
and the other which is related to Activity. Give the two attributes
different semantics though, because their uniqueness and behavior with
respect to the other attributes will be quite different. In one case
Process Version appears to be the key of the dimension, and in the
other it is somewhere in the middle of the hierarchy.

Alternatively, you can go with one of the hierarchies being natural and
the other being unnatural -- but it really depends on what you are
trying to do here...

Thanks,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.


"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uXBtNqIwGHA.2204 (AT) TK2MSFTNGP03 (DOT) phx.gbl...
A hierarchy built on top of a natural hierarchy can materialize the
children of each parent....

Um..ok, I think I maybe kinda sorta get what you're saying. As an
example, when attribute relationships are in place and I'm drilling
down into the hour level of July 15, 2006, SSAS knows the 24 specific
hours that belong to that day, and only needs to compute the cells for
that subset. But when there are no relationships, it needs to scan
through all hours of all days, figure out which to include, and then
do the calculations. I'm sure I'm not getting that exactly right, but
is that sort of the general idea?

A natural user hierarchy can skip attributes along the way of the
relationship chain.

Apologies, Akshai, I was both imprecise and inaccurate in my original
description of our hiearchies. Let me try this one more time. They are
actually:

System > Process > Process Version > Activity Group > Activity

and

System > Process > Activity Group > Activity > Process Version

In other words, the exact same Process Version attribute is used, but
in two different places in the hiearchies. We're not "skipping" levels
here.


"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:OOEuN1AwGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
More answers:

Is there any more you can say here? This is the situation for me -
after making the user hierarchy into a natural hierachy, drilling
down sped up so dramatically it was amazing. But it all seems a bit
like magic. If disk-based aggregations aren't to thank, then what?

A hierarchy built on top of a natural hierarchy can materialize the
children of each parent. So a request to the storage engine that asks
(for example) for the children of a member can simply ask for an
iterator over a pre-materialized sub-tree of the children. But a
hierarchy that is built on an unnatural hierarchy has to send queries
to the "raw" dimension data, do ordering on top of the result and
materialize the sub-tree "on the fly". So unnatural hierarchies are
much slower to iterate over for some common set operations.

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

The combination above looks like this in terms of attribute
relationships:

System (1,2)
|
Process (1,2)
|
Version (1)
|
Activity Group (1,2)
|
Activity (1,2)
|
Activity Version (2)

A natural user hierarchy can skip attributes along the way of the
relationship chain. So hierarchy (1) and (2) can have levels as shown
above and still be natural.

HTH,
Akshai
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:evcPvPAwGHA.4756 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Thanks Akshai! Great info.

So a query that happened to fetch data at the hours level could make
a later query at the days level really fast because the cache for
the hour-level data can be used to aggregate into the days-level
data.

In my case, I was navigating down the hierarchy from year to minute
(high-to-low-level), so I'm assuming your example doesn't apply
here.

Additionally, access to the user hierarchy becomes much faster if
it is natural -- and depending on the queries this can sometimes
lead to huge improvements in performance.

Is there any more you can say here? This is the situation for me -
after making the user hierarchy into a natural hierachy, drilling
down sped up so dramatically it was amazing. But it all seems a bit
like magic. If disk-based aggregations aren't to thank, then what?

2. You can have another attribute (Month of Year) which is related
to Month (or to Day) and query by that attribute hierarchy instead.

Looking around the AdvWorks cube, I noticed that pattern. I'll keep
it in mind. I just wanted to make sure I was understanding things
correctly.

It would help to see an example of this problem.

OK, in this case we have a set of business processes, each of which
are made up activities. Processes can be versioned, so each activity
has a version associated with it. One user hierarchy looks something
this:

System > Process > Version > Activity Group > Activity

This lets you, for example, examine all activities in a particular
version of a process.

However, for comparitive analysis, it's also useful to have the
following hierarchy as well:

System > Process > Activity Group > Activity > Activity Version

This lets you easily compare at two different versions of the same
activity.

So you see this is actually the same set of attributes, stacked in
different ways. I'm not sure how I should design the natural
hierarchies here (if at all).



"Akshai Mirchandani [MS]" <akshaim (AT) online (DOT) microsoft.com> wrote in
message news:%23kF2q$9vGHA.3372 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Some answers:

* The performance seems to be about the same (that is, pretty
fast) whether the aggregations are set to 0% or 70%. That confuses
me - I thought that setting up the relationships is what allows
the aggregations to be computed. But why would no aggregations
perform so well?

The benefits of good attribute relationships are in no way limited
purely to disk-based aggregations.

Once you have the attribute relationships, in-memory caches of data
can also be used for aggregating higher levels of data. So a query
that happened to fetch data at the hours level could make a later
query at the days level really fast because the cache for the
hour-level data can be used to aggregate into the days-level data.

Additionally, access to the user hierarchy becomes much faster if
it is natural -- and depending on the queries this can sometimes
lead to huge improvements in performance.

* I'm assuming that by setting up these relationships, I'm giving
up some flexibility in my queries. Is that true? For example, what
if I want to compare how January compares to February, across all
years. Are those aggregations not longer being performed, because
of the attribute relationships? If so, what's the best way to deal
with that? Or am I misunderstanding something?

In a sense, yes you are losing some flexibility -- now February
isn't an entity by itself, but rather each instance of February is
strongly associated with a year. However, you can still obtain the
results you need with different solutions:
1. You can quite easily write a calculation to aggregate over the
1st child of each year and the 2nd child of each year.
2. You can have another attribute (Month of Year) which is related
to Month (or to Day) and query by that attribute hierarchy instead.

The second solution is the simplest but you risk making your model
somewhat more complex by adding more attributes.

* I'm looking at making similar changes to a different dimension,
but that dimension has two different user hierarchies that combine
the same attributes in different ways. Does defining a natural
hierarchy through attribute relationships limit your ability to
combine those attributes in different navigational hierarchies?

It would help to see an example of this problem.

Generally its okay if the attributes are combined together in
different ways -- but you should make sure that they follow the
same rules and semantics when creating the relationships. Make sure
that the meaning of each attribute is the same in each hierarchy
and the functional dependencies hold true and then it should turn
out right.

In some cases, you may decide to actually split up attributes (like
Month and Month Of Year) because they really mean different things
in the two hierarchies. But that is a choice and design decision
that you need to make based on what kind of behavior you want to
obtain.

HTH,
Akshai

--
--
Try out the MSDN Forums for Analysis Services at:
http://forums.microsoft.com/MSDN/Sho...ID=83&SiteID=1

This posting is provided "AS IS" with no warranties, and confers no
rights
Please do not send email directly to this alias. This alias is for
newsgroup
purposes only.

"Kevin Dente" <kevind (AT) no (DOT) spam.denteworld.com> wrote in message
news:uB62pAWvGHA.4444 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
(Cross posted from MSDN forums, no response there yet)

I'm working on fixing some major query performance problems in an
AS2005 cube. My suspicion was that they were caused by the
original cube design not specifying attribute relationships in the
time dimension. I've read the Project REAL stuff and the P&P BI
Guidelines in an effort to fully grok the implications of
attribute relationships, natural hierarchies, and aggregations,
but there's still some stuff I'm unclear on.

Our calendar hierarchy is moderately deep - Year > Month > Day
Hour > Minute (where minute is actually a 15 minute time window).
The dimension table columns are the key, minute (0,15,30,45), hour
of day (0-23), day of month (0-days in month), month (1-12), year.
Originally there were attributes defined for each of these
columns, with each attribute using its corresponding column as its
KeyColumn, and with all attributes related directly to the key. A
user hierarchy provided navigation from the year down to the
minute. Performance, in a word, sucked (drilling down in the cube
browser to a low level in the hierarchy could take as long as 45
minutes!).

I changed the KeyColumn for each attribute below year to be a
composite of each higher level attribute (so the KeyColumns in
Month were Year and Month, for Day they were Year/Month/Day, etc).
Then I set the attribute relationships accordingly. I'm not sure
if this is the most efficient way to define the attributes, but it
seemed to be the only way to do it with our current dimension
table schema.

Once I made this change and re-processed, drilling through the
hierarchy went from 45 minutes to less than a second. Not a bad
improvement. However, there's still some things that I'm
unclear on.

* The performance seems to be about the same (that is, pretty
fast) whether the aggregations are set to 0% or 70%. That confuses
me - I thought that setting up the relationships is what allows
the aggregations to be computed. But why would no aggregations
perform so well?

* I'm assuming that by setting up these relationships, I'm giving
up some flexibility in my queries. Is that true? For example, what
if I want to compare how January compares to February, across all
years. Are those aggregations not longer being performed, because
of the attribute relationships? If so, what's the best way to deal
with that? Or am I misunderstanding something?

* I'm looking at making similar changes to a different dimension,
but that dimension has two different user hierarchies that combine
the same attributes in different ways. Does defining a natural
hierarchy through attribute relationships limit your ability to
combine those attributes in different navigational hierarchies?

Thanks,
Kevin




















Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.