dbTalk Databases Forums  

Best design practice for temporal facts...

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Best design practice for temporal facts... in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Giri
 
Posts: n/a

Default Best design practice for temporal facts... - 05-25-2004 , 04:21 AM






Would really appreciate any of your thoughts on how best to attack this
problem... Links to good online references or books also appreciated.

If you have a fact that changes over time, say the occupied surface area of
some land, how is it best to store that information in the data warehouse so
that cubes can work easily off them.

1) Store the position at any given moment in time.

2) Store the changes in position over time.

any other thoughts..

Bear in mind that we want to:

* Analyse the data over time to see trends
* If route 2 is used, then occupancy at any given moment in time would be
the sum of all transactions to date.. Is kind of calculated cell possible..

I realise the above might be a big vague but I hope that someone is able to
provide some pointers, it has been a while since I have worked with AS and
getting my head around all this is proving to be a challenge !

Thanks


GT



Reply With Quote
  #2  
Old   
Yuan Shao
 
Posts: n/a

Default RE: Best design practice for temporal facts... - 05-25-2004 , 09:45 PM






Hi Giri,

As I understand, you want to store the information in the data warehouse.
The data in the fact table changes often and you want to update the cube in
time.

You may try to use Real-Time cubes in your data warehouse. Real-time OLAP
resolves this issue by enabling ROLAP dimensions and partitions to
automatically refresh themselves when data in their underlying dimension or
fact tables changes. When working in concert with SQL Server 2000 as the
relational data source, Analysis Services can periodically poll the data
source for notifications about updates to dimension or fact tables
associated with specific ROLAP dimensions or partitions enabled for
real-time updates. If the Analysis server finds that a change to a
dimension or fact table has occurred, it can respond to the notifications
by flushing the Analysis server cache and automatically reprocessing the
associated ROLAP dimensions or partitions.

For more information regarding Real-Time Cubes, please refer to the
following article:
Real-Time Cubes
http://msdn.microsoft.com/library/de...us/olapdmad/ag
cubevarieties_0o4z.asp

Best wishes,

Michael Shao
Microsoft Online Partner Support
Get Secure! - www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.



Reply With Quote
  #3  
Old   
Giri
 
Posts: n/a

Default Re: Best design practice for temporal facts... - 05-26-2004 , 04:04 AM



useful information thanks.. But it doesn't really address my question about
how best to store a fact that changes over time to allow

1) Time based analysis
2) Snapshot reporting at any given moment in time.

I haven't made this clear enough so maybe an example would be a good idea...

lets say you have data tracking utilization of a resource. It is easy to
calculate the percentages etc..

Now do you store in a fact table the following columns of data:

DateTimeStamp (When was this data captured, FK to time dimension)
Which Machine
TotalUtilization

This is fine for any analysis that restricts itself to that particular time
slice.. but what happens when you want to start analyzing over different
time periods.. say rollup to a qtr on qtr analysis? the other option is
just to show the change in utilization at any given moment.

DateTimeStamp
WhichMachine
ChangeInUtilization

This way, a sum of all utilizations up to a given point would give you a net
utilization at that moment in time.. and would also allow you to do time
based analysis...

Which of the two, or another, is the better approach?

Thanks


Giri



""Yuan Shao"" <v-yshao (AT) online (DOT) microsoft.com> wrote

Quote:
Hi Giri,

As I understand, you want to store the information in the data warehouse.
The data in the fact table changes often and you want to update the cube
in
time.

You may try to use Real-Time cubes in your data warehouse. Real-time OLAP
resolves this issue by enabling ROLAP dimensions and partitions to
automatically refresh themselves when data in their underlying dimension
or
fact tables changes. When working in concert with SQL Server 2000 as the
relational data source, Analysis Services can periodically poll the data
source for notifications about updates to dimension or fact tables
associated with specific ROLAP dimensions or partitions enabled for
real-time updates. If the Analysis server finds that a change to a
dimension or fact table has occurred, it can respond to the notifications
by flushing the Analysis server cache and automatically reprocessing the
associated ROLAP dimensions or partitions.

For more information regarding Real-Time Cubes, please refer to the
following article:
Real-Time Cubes

http://msdn.microsoft.com/library/de...us/olapdmad/ag
cubevarieties_0o4z.asp

Best wishes,

Michael Shao
Microsoft Online Partner Support
Get Secure! - www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.





Reply With Quote
  #4  
Old   
John Desch [MS]
 
Posts: n/a

Default Re: Best design practice for temporal facts... - 05-28-2004 , 11:51 PM



Hi Giri,

If I understand your post correctly, you want to know how to store facts
that change over time so that those facts can be employed for time series
analysis and/or snapshot analysis.

I'll preface my comments by saying that what data are stored and how those
data points are stored is largely dependent on the business question being
posed. Obviously, there is value in knowing the size of a population in a
specific geographic area at any given point in time. Likewise, there is
value in understanding changes in the size of the population from one
distinct time period to another. In order to examine changes in population
size over time, it's necessary to store the population values for each
distinct time period. The distinct values can then be used to calculate a
number of different values.

For instance, if we have the following dimension table:
TimeID PopTime
1 2003-01-01
2 2003-02-01
3 2003-03-01
4 2003-04-01
5 2003-05-01

With the following fact table
TimeID AreaID PopulationSize
1 1 1500
1 2 15000
2 1 3000
2 2 13500
3 1 4500
3 2 12000
4 1 6000
4 2 12500
5 1 7900
5 2 15000

We can build a cube using the PopulationSize column as a data point for
each point in time and geographic area. We can then use a calculated member
similar to the one below to display changes in the data over time.

[Measures].[PopulationSize]-
IIF(IsGeneration([Time].Currentmember,0),
ParallelPeriod([Time].[Day]),
IIF(IsGeneration([Time].Currentmember,1),
ParallelPeriod([Time].[Month]),
IIF(IsGeneration([Time].Currentmember,2),
ParallelPeriod([Time].[Quarter]),
IIF(IsGeneration([Time].CurrentMember,3),
ParallelPeriod([Time].[Year]),0))))

In this particular MDX expression, we're checking for the generation of the
time dimension level, then subtracting the value for the ParallelPeriod, to
calculate the difference between the two time periods.

Regards,

John Desch, MCDBA, MCSD
Microsoft SQL Analysis Services Support

This posting is provided “AS IS”, with no warranties, and confers no rights.


Reply With Quote
  #5  
Old   
Giri
 
Posts: n/a

Default Re: Best design practice for temporal facts... - 06-01-2004 , 03:41 AM



Thanks John,

This is exactly the kind of advice I was looking for... How would your
calculated measure work if the cube was showing for the whole year however?
You wouldn't want to sum the population size and you wouldn't want the
average either.. what you really want is the most recent fact for the period
being observed. Is this possible using a calculated measure?

Thanks in advance


Giri


"John Desch [MS]" <jdesch (AT) online (DOT) microsoft.com> wrote

Quote:
Hi Giri,

If I understand your post correctly, you want to know how to store facts
that change over time so that those facts can be employed for time series
analysis and/or snapshot analysis.

I'll preface my comments by saying that what data are stored and how those
data points are stored is largely dependent on the business question being
posed. Obviously, there is value in knowing the size of a population in a
specific geographic area at any given point in time. Likewise, there is
value in understanding changes in the size of the population from one
distinct time period to another. In order to examine changes in population
size over time, it's necessary to store the population values for each
distinct time period. The distinct values can then be used to calculate a
number of different values.

For instance, if we have the following dimension table:
TimeID PopTime
1 2003-01-01
2 2003-02-01
3 2003-03-01
4 2003-04-01
5 2003-05-01

With the following fact table
TimeID AreaID PopulationSize
1 1 1500
1 2 15000
2 1 3000
2 2 13500
3 1 4500
3 2 12000
4 1 6000
4 2 12500
5 1 7900
5 2 15000

We can build a cube using the PopulationSize column as a data point for
each point in time and geographic area. We can then use a calculated
member
similar to the one below to display changes in the data over time.

[Measures].[PopulationSize]-
IIF(IsGeneration([Time].Currentmember,0),
ParallelPeriod([Time].[Day]),
IIF(IsGeneration([Time].Currentmember,1),
ParallelPeriod([Time].[Month]),
IIF(IsGeneration([Time].Currentmember,2),
ParallelPeriod([Time].[Quarter]),
IIF(IsGeneration([Time].CurrentMember,3),
ParallelPeriod([Time].[Year]),0))))

In this particular MDX expression, we're checking for the generation of
the
time dimension level, then subtracting the value for the ParallelPeriod,
to
calculate the difference between the two time periods.

Regards,

John Desch, MCDBA, MCSD
Microsoft SQL Analysis Services Support

This posting is provided "AS IS", with no warranties, and confers no
rights.




Reply With Quote
  #6  
Old   
John Desch [MS]
 
Posts: n/a

Default Re: Best design practice for temporal facts... - 06-01-2004 , 11:46 AM



Hi Giri,

Using that particular MDX expression, in a calculated member, would result
in displaying the difference in the Population Size for the equivalent
parallel period. So if we happened to be browsing/querying at the year
level, we'd see the difference in the population from one year to the next.

You're correct, with population data, we'd want to examine the most recent
data. In that case, we'd probably want to consider a couple of different
possibilities. One possibility would be to use another calculated member
with an expression similar to the following:

IIF(IsGeneration(Time.CurrentMember,0),[Measures].[PopulationSize],([Time].C
urrentMember.LastChild,[Measures].[PopulationSize]))

There is a possibility that the value associated with the expression
(Time.CurrentMember.LastChild,[Measures].[PopulationSize]) could be a NULL
rather than a numeric value, so the calculated member would need to check
for nulls.

Another approach is to use a series of calculated cells to return data
values.

In that case, we'd want to build the following calculated cells:
Name = TimeYearly
Calculation SubCube = {[Measures].[PopulationSize]}, [Time].[Year].MEMBERS
Calculation Condition = Applies to entire calculation subcube
Calculation Value = IIF(IsEmpty(Time.CurrentMember.Children(3)),

IIF(IsEmpty(Time.CurrentMember.Children(2)),

IIF(isEmpty(Time.CurrentMember.Children(1)),

IIF(IsEmpty(Time.CurrentMember.Children(0)),

null,Time.CurrentMember.Children(0)),

Time.CurrentMember.Children(1)),

Time.CurrentMember.Children(2)),

time.currentmember.children(3))

Name = TimeQuarterly
Calculation SubCube = {[Measures].[PopulationSize]},
[Time].[Quarter].MEMBERS
Calculation Condition = Applies to entire calculation subcube
Calculation Value = IIF(IsEmpty(Time.CurrentMember.Children(2)),

IIF(isEmpty(Time.CurrentMember.Children(1)),

IIF(IsEmpty(Time.CurrentMember.Children(0)),

null,Time.CurrentMember.Children(0)),

Time.CurrentMember.Children(1)),

Time.CurrentMember.Children(2))

HTH

John Desch, MCDBA, MCSD
Microsoft SQL Analysis Services Support

This posting is provided “AS IS”, with no warranties, and confers no rights.


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.