dbTalk Databases Forums  

Single data source per cube?

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Single data source per cube? in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
SocSecTrainWreck@earthlink.net
 
Posts: n/a

Default Single data source per cube? - 10-14-2005 , 12:07 PM







As I understand it, there may only be a single data source per cube. Is
there something I am missing here (I'm not clear about how multiple
partitions would affect this since the personal edition of AS does not
allow it, which would mean that you must have the EE to do development
or learning)? This seems like a killer to me, because it means that
underlying relational (or whatever) data must be designed in
anticipation of pulling the data into a cube, which would be
fundamentally in opposition to the whole point of data mining.

I'm aware that there are ridiculous workarounds such as creating views
for all objects within the target database/datasource, but it takes
something fundamental to MS SQL Server- multiple databases on a single
server that are accessible by each other- and throws it out the window.
What's up with that?


Reply With Quote
  #2  
Old   
Darren Gosbell
 
Posts: n/a

Default Re: Single data source per cube? - 10-14-2005 , 06:50 PM







Quote:
which would mean that you must have the EE to do development
or learning)?
The developer edition is the best edition to use for dev or learning as
it is equivalent to Enterprise. But all the partitions must still come
from the one database.

Quote:
This seems like a killer to me, because it means that
underlying relational (or whatever) data must be designed in
anticipation of pulling the data into a cube, which would be
fundamentally in opposition to the whole point of data mining.
AS2k5 lifts this restriction and allows you to pull data from multiple
datasources into the one cube.

Although in AS2k, it's not as silly a restriction as it may seem. Most
of the time when you are dealing with multiple datasources, they are not
all consistent and you need to go through some sort of transformation
step after which you load the cleansed data into a central data
warehouse. In other situations people do not want to query the source
systems directly from AS as this could affect performance.

--
Regards
Darren Gosbell [MCSD]
<dgosbell_at_yahoo_dot_com>
Blog: http://www.geekswithblogs.net/darrengosbell


Reply With Quote
  #3  
Old   
Jéjé
 
Posts: n/a

Default Re: Single data source per cube? - 10-16-2005 , 08:15 AM



its always recommended to use views to populate AS objects (cubes,
dimensions)
these views allows you to insure that you provide the right information to
the cube regarding the databases structure changes.

also, in AS, a cube is created using 1 fact table only and the dimensions
must come from the same database.
Without 1 point of access, how can you insure that the product id 1 is the
same in all databases?

I don't understand why you found this solution is an opposite of a dataming
solution.
A good datamining solution must be done using good & clean data.
Also you must be able to create test data to validate your datamining model.
If you have multiple databases & multiple sources, creating a good model
starts to become a real problem for your analysts.

what you found a limitation is exactly what a datawarewhouse must be:
having 1 point of access for the information with cleansed & easy to use
data, for data mining, olap analysis or reporting. to insure that all the
users access the same information with good performance & shared metadata.

Also, I have a question for you, you have multiple databases on the same
server... does this mean that your databases have the same schema & the same
information?
why you don't merge your databases?


<SocSecTrainWreck (AT) earthlink (DOT) net> wrote

Quote:
As I understand it, there may only be a single data source per cube. Is
there something I am missing here (I'm not clear about how multiple
partitions would affect this since the personal edition of AS does not
allow it, which would mean that you must have the EE to do development
or learning)? This seems like a killer to me, because it means that
underlying relational (or whatever) data must be designed in
anticipation of pulling the data into a cube, which would be
fundamentally in opposition to the whole point of data mining.

I'm aware that there are ridiculous workarounds such as creating views
for all objects within the target database/datasource, but it takes
something fundamental to MS SQL Server- multiple databases on a single
server that are accessible by each other- and throws it out the window.
What's up with that?




Reply With Quote
  #4  
Old   
SocSecTrainWreck@earthlink.net
 
Posts: n/a

Default Re: Single data source per cube? - 10-17-2005 , 03:18 PM




Jéjé wrote:
Quote:
its always recommended to use views to populate AS objects (cubes,
dimensions)
these views allows you to insure that you provide the right information to
the cube regarding the databases structure changes.

also, in AS, a cube is created using 1 fact table only and the dimensions
must come from the same database.
Without 1 point of access, how can you insure that the product id 1 is the
same in all databases?
This is completely unrelated to whether product id 1 is in one or more
databases. It could just as easily appear in two tables in the same
database with different meanings as in one. But take my word for it,
there are valid reasons for having a column over a single domain appear
in two databases.

Quote:
I don't understand why you found this solution is an opposite of a dataming
solution.
I thought that the objective of datamining was to pull together data
from possibly disparate sources with the objective of mining business
intelligence from it. Without the workaround, there is an assumption
that the data warehouse must be a physically monolithic source. Using
linked servers and views allows a workaround, but also requires
constructing an additional layer of abstraction in the relational
database that supports the cube.

Quote:
A good datamining solution must be done using good & clean data.
Of course. But working back from that conclusion there is no way to
arrive at the necessity for all data to be in a single database, only
that all data wherever it resides should be good and clean.

Quote:
Also you must be able to create test data to validate your datamining model.
If you have multiple databases & multiple sources, creating a good model
starts to become a real problem for your analysts.
I don't agree (I think that in some cases it's actually the opposite),
but even if I accept this premise, it does not follow that all data
_must always_ be in the same database, because there may be reasons for
using multiple databases or sources that outweigh the benefit of a
simpler schema.

Quote:
what you found a limitation is exactly what a datawarewhouse must be:
having 1 point of access for the information with cleansed & easy to use
data, for data mining, olap analysis or reporting. to insure that all the
users access the same information with good performance & shared metadata.
I think that a datawarehouse can be a logical entity and still meet
these requirements.

Quote:
Also, I have a question for you, you have multiple databases on the same
server... does this mean that your databases have the same schema & the same
information?
No.

Quote:
why you don't merge your databases?
Short answer: operational requirements having to do with the frequency
and method by which data is refreshed. I humbly ask you to assume that
I know what I am doing.

Personally, I have always viewed the ability to have multiple databases
on a single server as a particular strength for SQL Server, and I'm
surprised by this limitation within OLAP.



Reply With Quote
  #5  
Old   
SocSecTrainWreck@earthlink.net
 
Posts: n/a

Default Re: Single data source per cube? - 10-17-2005 , 04:23 PM




Darren Gosbell wrote:

Quote:
AS2k5 lifts this restriction and allows you to pull data from multiple
datasources into the one cube.
That's good to know; it didn't seem possible from what I could tell but
I did not spend a lot of time trying it.

Quote:
Although in AS2k, it's not as silly a restriction as it may seem. Most
of the time when you are dealing with multiple datasources, they are not
all consistent and you need to go through some sort of transformation
step after which you load the cleansed data into a central data
warehouse.
Yes, but this does not mean that it must always be in the same
database.

Quote:
In other situations people do not want to query the source
systems directly from AS as this could affect performance.
But what if the source systems are just two different databases
dedicated to OLAP, let alone two different DBMSs?



Reply With Quote
  #6  
Old   
Darren Gosbell
 
Posts: n/a

Default Re: Single data source per cube? - 10-17-2005 , 06:24 PM



Quote:
Yes, but this does not mean that it must always be in the same
database.
But if you have separate data marts, usually this means that you have
separate subject areas and hence separate OLAP cubes. An OLAP database
can have multiple data sources. Its just that (in AS2k) a single cube
has to come from a single relational source.

Quote:
But what if the source systems are just two different databases
dedicated to OLAP, let alone two different DBMSs?
You would only do this if there was a logic division in the data, so you
would then create 2 OLAP cubes or databases. If the two sets of data
logically belong together it's hard for me to see why you need to have
two relational stores.

I admit that it does take extra time and disk space to bring the data
together, but usually there are other benefits and considerations, such
as providing a single, cleansed, consistent view of the data.

I think what you want to do is to delay the convergence of the data
until it gets to the OLAP server. The short answer is that you can't
really do this well in AS2k, in AS2k5 the UDM abstracts the physical
datasources and this is definitely possible.

--
Regards
Darren Gosbell [MCSD]
<dgosbell_at_yahoo_dot_com>
Blog: http://www.geekswithblogs.net/darrengosbell


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.