dbTalk Databases Forums  

Loading time of a DW

comp.databases.olap


  #1
Sandro
Loading time of a DW - 03-10-2008, 01:45 AM

Hello,

I have a question about the loading time of a data warehouse. Do you
know the typical loading rate of a data warehouse (e.g. how many GB a
day)? Among the technical factors, does it depend on the system, the
way the DW is structured, etc.?

Thanks in advance for your responses.

  #2
Brad
Re: Loading time of a DW - 03-13-2008, 11:40 AM

On Mar 10, 2:45 am, Sandro <sandro.sai... (AT) gmail (DOT) com> wrote:
Quote:
Hello,

I have a question about the loading time of a data warehouse. Do you
know the typical loading rate of a data warehouse (e.g. how many GB a
day)? Among the technical factors, does it depend on the system, the
way the DW is structured, etc.?

Thanks in advance for your responses.

It very much depends on the structure, the hardware configuration
(especially drive isolation), and the business rules being applied. We
typically load 3 million rows into a star schema within minutes.
Brad


  #3
klaus
Re: Loading time of a DW - 03-17-2008, 02:50 PM

Sandro wrote:
Quote:
Hello,

I have a question about the loading time of a data warehouse. Do you
know the typical loading rate of a data warehouse (e.g. how many GB a
day)? Among the technical factors, does it depend on the system, the
way the DW is structured, etc.?

Thanks in advance for your responses.

There is no generic answer to that.
Assuming all extraction and transformation processes are already done,
it depends mainly on the underlying database system and, of course, on
I/O and hardware power.

Conventional row-oriented databases are meant for OLTP applications and
are mainly I/O bound. Even with multiple CPUs you won't get more than a
few million rows an hour, or worse, no matter the schema, and don't
forget that there is a lot of reorganization and index maintenance.
To me, a load is complete when you can actually USE the data.
Performance degrades with the number of rows.

To do a bit better you can take an MPP (massively parallel processing)
system (Teradata / Netezza), add a lot of hardware nodes and
parallelize. You will probably be well into the millions per hour, but
a realtime warehouse? I don't think so, unless you invest a lot,
including in a power plant (not quite green).
Performance is acceptable, but when you reach the limit of the
configured system and can't parallelize any further, increasing
capacity takes an awfully large investment.

If you want to load millions of rows within a minute (e.g. stock ticks,
quotes etc.) and be truly realtime, you should look at a vector-oriented
(columnar) database such as Sybase IQ, ParAccel or the like, but you
still need good CPU power. I've seen 948 megarows (40 columns) loaded in
roughly 45 minutes on 2 IBM Power quad-cores (16 GB main memory) running
Linux, on Sybase IQ, for shopping basket / sales slip analysis (what
kind of disks, I don't know). We were receiving sales slips from
several branches.
Performance is acceptable, but do not use these databases for OLTP-like
applications: updating records can become nasty, in particular if it is
done by many users.


Still, whatever you do, from daily to near-realtime loads, the
influencing factors are all of these: type of database, schema and
hardware (CPU, RAM, controllers and disk set). The lesson I've learned
from this: there is no such thing as a universal database, and more
CPUs are not always helpful.

So you are somewhere between one million and several billion rows a
day, depending on the platform and what you want to do.

My focus is mainly near-realtime data marts (e.g. mobile phone
network optimization, fraud detection and quality control) for data
mining access.
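
As a rough sanity check on the figures quoted in this thread, the sketch below converts them to rows per second and rows per day. It is only arithmetic: the 948 million rows in 45 minutes comes from the Sybase IQ example above, while reading "a few million rows an hour" as 3 million is an assumption made purely for illustration, as is the idea that either rate could be sustained for a full 24 hours. The function names are hypothetical.

```python
# Back-of-the-envelope conversion of the load figures quoted in this thread.
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_second(rows: float, minutes: float) -> float:
    """Average load rate for `rows` loaded in `minutes`."""
    return rows / (minutes * 60)


def rows_per_day(rate_per_second: float) -> float:
    """Rows loaded in 24 hours, assuming the rate is sustained all day."""
    return rate_per_second * SECONDS_PER_DAY


# Figure from the post: 948 million rows in about 45 minutes (Sybase IQ).
iq_rate = rows_per_second(948e6, 45)

# Assumption for illustration: "a few million rows an hour" taken as 3 million.
row_store_rate = rows_per_second(3e6, 60)

if __name__ == "__main__":
    figures = [
        ("column store (948M rows / 45 min)", iq_rate),
        ("row store (assumed 3M rows / hour)", row_store_rate),
    ]
    for label, rate in figures:
        print(f"{label}: {rate:,.0f} rows/s, {rows_per_day(rate):,.0f} rows/day")
```

Under these assumptions the Sybase IQ figure works out to about 351,000 rows per second, or roughly 30 billion rows a day, and the row-store figure to about 72 million rows a day, which fits the range given above.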
Regards
Klaus

klaus roehler, Greven, DE
klaus-roehlerATversatel.de




