3 Terabyte data warehouse system requirement

Discuss 3 Terabyte data warehouse system requirement in the comp.databases.olap forum.



#1  d
3 Terabyte data warehouse system requirement - 02-03-2005, 08:36 PM

I expect a 3 terabyte warehouse would need no less than:
- 8 processors
- 32 GB RAM
- SAN storage for data, logs, backups

Note: Using SQL Server 2000 on Windows Server 2003 (Datacenter Edition?), with
Analysis Services on the same box. Hits coming from ProClarity, Excel and Crystal.

What do you think?

d.




#2  Stephan Eggermont
Re: 3 Terabyte data warehouse system requirement - 02-04-2005, 06:27 AM

d <d@d.com> wrote:
> I expect a 3 terabyte warehouse would need no less than
> 8 processors
> 32 GIG ram
> SAN storage for data, logs, backups
>
> Note: Using SQL Server 2000, W2003 Server (data center?) Analysis Services
> on the same box. Hits coming from ProClarity, Excel and Crystal.
>
> What do you think?

I think you should do the calculations to size your
hardware. And I know you haven't provided the information
that's needed to do those calculations.

Stephan


#3  William Goedicke
Re: 3 Terabyte data warehouse system requirement - 02-06-2005, 10:06 AM

Dear d -

Quote:
"d" == d <d@d.com> writes:
d> I expect a 3 terabyte warehouse would need no less than 8
d> processors 32 GIG ram SAN storage for data, logs, backups

d> Note: Using SQL Server 2000, W2003 Server (data center?)
d> Analysis Services on the same box. Hits coming from ProClarity,
d> Excel and Crystal.

d> What do you think?

I wouldn't even consider running something this size on Microsoft
software. The fact that I "wouldn't even consider" it accurately
identifies me as biased against Microsoft products for anything above
small-scale computing, but 20 years' experience says their products
simply don't scale well on a single platform, and nowhere near as well
as they claim.

Have you checked the diminishing marginal returns of additional
CPUs in an SMP box due to context switching in Windows Server 2003?
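
One rough way to put numbers on diminishing returns is Amdahl's law. Note that it models a fixed serial fraction of the work rather than context-switching overhead specifically, and the 10% serial fraction below is an assumed figure for illustration, not a measurement of any Windows or SQL Server system.

```python
def amdahl_speedup(cpus: int, serial_fraction: float) -> float:
    """Ideal speedup on `cpus` processors when `serial_fraction`
    of the work cannot be parallelised (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cpus)


SERIAL_FRACTION = 0.10  # assumption for illustration, not a measured value

for cpus in (1, 2, 4, 8, 12, 16):
    speedup = amdahl_speedup(cpus, SERIAL_FRACTION)
    print(f"{cpus:2d} CPUs -> {speedup:.2f}x speedup")
```

Under this assumption the speedup can never exceed 1 / 0.10 = 10x, however many CPUs are added, so each extra processor buys less than the one before it.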

You definitely want to get references of a similar or larger
installation.

I notice that Windows Server 2003 and SQL Server do appear (as high as
fifth) in the Transaction Processing Performance Council benchmarks at
www.tpc.org, but remember those results are for *highly* tuned (even
customized) versions. You won't have that expertise in-house unless you
hire MS's best developers and they bring the source with them.

HTH.

- Billy




#4  d
Re: 3 Terabyte data warehouse system requirement - 02-08-2005, 08:58 PM

Actually, MS has a case study where a 2.7 terabyte OLAP db used 12 CPUs and
16 GB of RAM.
I guess you don't have the cojones to make a decision.

What a useless newsgroup.

d.

"Stephan Eggermont" <stephan (AT) stack (DOT) nl> wrote

> d <d@d.com> wrote:
>> I expect a 3 terabyte warehouse would need no less than
>> 8 processors
>> 32 GIG ram
>> SAN storage for data, logs, backups
>>
>> Note: Using SQL Server 2000, W2003 Server (data center?) Analysis
>> Services
>> on the same box. Hits coming from ProClarity, Excel and Crystal.
>>
>> What do you think?
>
> I think you should make the calculations to size your
> hardware. And I know you've not provided the information
> that's needed to make those calculations.
>
> Stephan



#5  Stephan Eggermont
Re: 3 Terabyte data warehouse system requirement - 02-09-2005, 06:50 AM

d <d@d.com> wrote:
> actually, MS has a case study where a 2.7 terabyte OLAP db used 12 CPUs and
> 16 GIG.
> I guess you don't have the cojones to make a decision.
>
> what a useless newsgroup.

No, a perfectly good one. If you don't know enough about building
data warehouses to see that the MS case study is not relevant because
the number of dimensions is unrealistically low, you don't know
enough to size a data warehouse.

Stephan


#6  Stephan Eggermont
Re: 3 Terabyte data warehouse system requirement - 02-09-2005, 06:56 AM

d <d@d.com> wrote:
> I expect a 3 terabyte warehouse would need no less than
> ....
> SAN storage for data, logs, backups

I would expect SANs to be exceptionally unsuited to
data warehousing.

Stephan


#7  bucknuggets@yahoo.com
Re: 3 Terabyte data warehouse system requirement - 02-15-2005, 01:18 PM

> What do you think?

Hmm, I think it depends on:
- data model
- database server features: parallelism, partitioning,
  clustering, materialized views with query rewrite, etc.
- query frequency
- query resource impact
- query performance goals
- data rolloff/archival requirements
- data load frequency and impact
- aggregation frequency and design

There are so many variables that I almost always prefer to build a
series of prototypes and target a very scalable solution. That way
you can initially go with cheaper hardware and a lower DBMS cost, and
then scale up incrementally as usage increases.
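
A few of the factors above can be turned into a first-pass throughput estimate before any prototype is built. This is only a sketch: the `Workload` fields, the helper names and every number are hypothetical placeholders, to be replaced with measurements from your own prototypes.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; every field is an assumption to be measured."""
    scanned_gb_per_query: float  # data a typical query has to read
    concurrent_queries: int      # queries expected to run at the same time
    target_seconds: float        # response-time goal for one query
    load_gb_per_batch: float     # volume loaded in each batch window
    load_window_hours: float     # length of the batch window


def query_read_mb_per_s(w: Workload) -> float:
    """Sustained read rate needed to meet the query response-time goal."""
    return w.scanned_gb_per_query * 1024 * w.concurrent_queries / w.target_seconds


def load_write_mb_per_s(w: Workload) -> float:
    """Sustained write rate needed to finish a load inside its window."""
    return w.load_gb_per_batch * 1024 / (w.load_window_hours * 3600)


# Placeholder numbers only; replace with figures from a prototype.
example = Workload(
    scanned_gb_per_query=20,
    concurrent_queries=5,
    target_seconds=60,
    load_gb_per_batch=50,
    load_window_hours=4,
)
print(f"query reads: {query_read_mb_per_s(example):.0f} MB/s")
print(f"batch load:  {load_write_mb_per_s(example):.1f} MB/s")
```

The point of the exercise is that the answer swings by orders of magnitude with the inputs, which is why a processor and RAM count cannot be read off the raw data volume alone.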

Also, any time you combine the warehouse with the mart you're asking
for trouble. My preference is to keep them separate: the warehouse
becomes the simplest component in the architecture, and keeping it that
way solves a ton of problems. It also means that you can quickly recreate
a mart with new rules over a weekend, can support multiple marts on
smaller servers so that different departments have greater control over
their performance, can have smaller/simpler marts for Crystal and Excel,
maybe an all-summary mart for some other application, etc.

buck



#8  Nigel Pendse
Re: 3 Terabyte data warehouse system requirement - 02-15-2005, 01:38 PM

<bucknuggets (AT) yahoo (DOT) com> wrote

>> What do you think?
>
> hmmm, I think it depends on:
> - data model
> - database server features - parallelism, partitioning,
>   clustering, materialized views with query rewrite, etc
> - query frequency
> - query resource impact
> - query performance goals
> - data rolloff/archival requirements
> - data load frequency, impact
> - aggregation frequency, design
>
> There are so many variables, that I almost always prefer to build a
> series of prototypes, and target a very scalable solution: that way
> you can initially go with cheaper hardware, and lower dbms cost, but
> can increase incrementally as usage increases.
>
> Also, any time you combine the warehouse with the mart you're asking
> for trouble. My preference is to keep them separate - the warehouse
> becomes the simplest component in the architecture and it solves a ton
> of problems. It also means that you can quickly recreate a mart with
> new rules over a weekend, can support multiple marts on smaller
> servers so that different departments have greater control over their
> performance, can have smaller/simpler marts for crystal & excel, maybe
> an all-summary mart for some other application, etc, etc.

As this app uses Analysis Services, most of these questions are not
applicable. I assume all the queries would be against MOLAP cubes which,
in most cases, will respond very quickly.




#9  bucknuggets@yahoo.com
Re: 3 Terabyte data warehouse system requirement - 02-15-2005, 03:22 PM

Nigel wrote:
> As this app uses Analysis Services, most of these questions are not
> applicable. I assume all the queries would be against MOLAP cubes
> which, in most cases, will respond very quickly.

Hmm, I've never heard of a 3 TB MOLAP cube, let alone one responding
very quickly.



#10  Nigel Pendse
Re: 3 Terabyte data warehouse system requirement - 02-15-2005, 03:59 PM

<bucknuggets (AT) yahoo (DOT) com> wrote

> Nigel wrote:
>> As this app uses Analysis Services, most of these questions are not
>> applicable. I assume all the queries would be against MOLAP cubes
>> which, in most cases, will respond very quickly.
>
> hmm, I've never heard of a 3 tb molap cube, let alone it responding
> very quickly.

The data would get a lot more compact with MOLAP storage -- typically
one could expect between a four- and ten-fold contraction. Analysis
Services stores data a lot more efficiently than relational databases
do, even if you load all the row-level detail into the cube, which is
unlikely. You can always drill down to detail level when it's needed,
which probably isn't very often.

And I assume the data would be loaded into multiple smaller cubes and
partitions, rather than one humongous hypercube which would be very slow
to build/update.
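
As a back-of-envelope check on the four- to ten-fold contraction mentioned above, the sketch below assumes the whole 3 TB of relational data went into MOLAP storage (which the post itself says is unlikely) and treats 1 TB as 1024 GB. The function name is made up for illustration.

```python
def molap_size_gb(relational_tb: float, contraction: float) -> float:
    """Estimated MOLAP size in GB for a relational size in TB
    and an assumed contraction factor."""
    return relational_tb * 1024 / contraction


RELATIONAL_TB = 3.0  # warehouse size from the original question

for contraction in (4, 10):  # range quoted in the post
    size = molap_size_gb(RELATIONAL_TB, contraction)
    print(f"{contraction}x contraction -> about {size:.0f} GB")
```

On those assumptions the cube data would land somewhere between roughly 307 GB and 768 GB in total, before it is split across multiple cubes and partitions.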



