dbTalk Databases Forums  

Data warehouse

comp.databases.mysql comp.databases.mysql


Discuss Data warehouse in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #11  
Old   
Ignoramus5263
 
Posts: n/a

Default Re: Data warehouse - 09-09-2011 , 06:01 AM






On 2011-09-09, Axel Schwenke <axel.schwenke (AT) gmx (DOT) de> wrote:
Quote:
Ignoramus3367 <ignoramus3367 (AT) NOSPAM (DOT) 3367.invalid> wrote:
On 2011-09-08, Peter H. Coffin <hellsop (AT) ninehells (DOT) com> wrote:
On Thu, 08 Sep 2011 13:33:51 -0500, Ignoramus3367 wrote:
I am building a data warehouse for data mining purposes.

It will be a RAID 6 array with 12 2TB drives. Server had 48 GB of
memory, 12 intel CPUs, 3Ware 9750 for RAID card.

It will have several tables, with one biggest by far.

My design principles are

1) Use InnoDB with a separate file per table

I'm curious as to your thinking on this one.

Not much, I just do not want this database intermixed with any other.

That's not an answer, really.

InnoDB does not shine too much if your data set is much larger than
the buffer pool because InnoDB funnels *all* I/O through that pool.
Sounds like a joke to me!

Quote:
MyISAM OTOH puts only indexes in memory and is thus better in this
situation. However you certainly don't want to use MyISAM for such
amounts of data because it would take ages to check or repair a table
after a crash.

Exactly.

Quote:
But then MariaDB was designed exactly for that: it's
crashsafe and uses memory in a comparable fashion.
Have you used it, is it reliable and robust etc?

Quote:
There are also third party storage engines for MySQL that are targeted
at data mining:

http://www.mysql.com/why-mysql/data-warehouse.html

I have no hands-on experience with those though.
Well, thanks a lot.

Reply With Quote
  #12  
Old   
Peter H. Coffin
 
Posts: n/a

Default Re: Data warehouse - 09-09-2011 , 10:24 AM






On Thu, 08 Sep 2011 21:43:00 -0500, Ignoramus3367 wrote:
Quote:
On 2011-09-09, Jerry Stuckle <jstucklex (AT) attglobal (DOT) net> wrote:
I do agree with Peter on one thing - you shouldn't be selecting the
RDBMS you're using yet. You don't even have a clear idea of WHAT you're
going to do, much less HOW you're going to do it.

And some of your ideas on operations are questionable - like inserting
into a temp table then into the main table. You very well could be
hurting performance.

Personally, I wouldn't use MySQL for such a large database; it will
work, but there are other products better suited to such things.

I am EXTREMELY open to suggestions here. My experience with large
mysql datasets have not been very encouraging. I am all ears and I
would love to hear more.

What else? Postgres?
What's your budget, realistically? I don't have a good reason to
recommend Postgres over MySQL for this; they're both in the "good enough
for some purposes and the price sure is nice" category. But the one
thing that setting up a data warehouse that is fast, reliable, and easy
(-ish) to manage is *not* is inexpensive. Databases that scale well
and are (nearly) failure-resistant aren't free. Backup solutions and
storage manages aren't free. Middleware software that can help automate
and schedule rebuilding of your warehouse data periodically isn't free
either. But sometimes it's worth it.

--
48. I will treat any beast which I control through magic or technology
with respect and kindness. Thus if the control is ever broken, it
will not immediately come after me for revenge.
--Peter Anspach's list of things to do as an Evil Overlord

Reply With Quote
  #13  
Old   
The Natural Philosopher
 
Posts: n/a

Default Re: Data warehouse - 09-09-2011 , 10:47 AM



Peter H. Coffin wrote:
Quote:
On Thu, 08 Sep 2011 21:43:00 -0500, Ignoramus3367 wrote:
On 2011-09-09, Jerry Stuckle <jstucklex (AT) attglobal (DOT) net> wrote:
I do agree with Peter on one thing - you shouldn't be selecting the
RDBMS you're using yet. You don't even have a clear idea of WHAT you're
going to do, much less HOW you're going to do it.

And some of your ideas on operations are questionable - like inserting
into a temp table then into the main table. You very well could be
hurting performance.

Personally, I wouldn't use MySQL for such a large database; it will
work, but there are other products better suited to such things.
I am EXTREMELY open to suggestions here. My experience with large
mysql datasets have not been very encouraging. I am all ears and I
would love to hear more.

What else? Postgres?

What's your budget, realistically? I don't have a good reason to
recommend Postgres over MySQL for this; they're both in the "good enough
for some purposes and the price sure is nice" category. But the one
thing that setting up a data warehouse that is fast, reliable, and easy
(-ish) to manage is *not* is inexpensive. Databases that scale well
and are (nearly) failure-resistant aren't free. Backup solutions and
storage manages aren't free. Middleware software that can help automate
and schedule rebuilding of your warehouse data periodically isn't free
either. But sometimes it's worth it.

This is a very good point: if it starts t get expensive to work around
issues of 'free' software, you have to ask yourself if it might not be
cheaper to - ahem - buy some consultancy and some product from e.g. Oracle.

Reply With Quote
  #14  
Old   
Geoff Muldoon
 
Posts: n/a

Default Re: Data warehouse - 09-11-2011 , 08:29 PM



ignoramus3367 (AT) NOSPAM (DOT) 3367.invalid says...

Quote:
I am building a data warehouse for data mining purposes.

Could you criticize anything in this approach.
Are you actually building a "real" data warehouse, or simply a reporting
silo/datamart copy?

By "real" I mean ETL transformation from the source to create a
dimensionally-modelled star-schema data representation. Have you read
anything by Bill Inmon, Ralph Kimball or Bert Scalzo?

GM

Reply With Quote
  #15  
Old   
Ignoramus21330
 
Posts: n/a

Default Re: Data warehouse - 09-11-2011 , 08:37 PM



On 2011-09-12, Geoff Muldoon <geoff.muldoon (AT) trap (DOT) gmail.com> wrote:
Quote:
ignoramus3367 (AT) NOSPAM (DOT) 3367.invalid says...

I am building a data warehouse for data mining purposes.

Could you criticize anything in this approach.

Are you actually building a "real" data warehouse, or simply a reporting
silo/datamart copy?

By "real" I mean ETL transformation from the source to create a
dimensionally-modelled star-schema data representation. Have you read
anything by Bill Inmon, Ralph Kimball or Bert Scalzo?
I want to create a platform for data mining, to find valuable data in
a couple of tables. I have been messing with such data for years, but
never with an all-out data warehouse approach. I want to create a
systematic, systematized system. I may not have as much DW education
as I would like to have.

i

Reply With Quote
  #16  
Old   
Bodo
 
Posts: n/a

Default Re: Data warehouse - 09-23-2011 , 05:44 AM



Quote:
I am building a data warehouse for data mining purposes.

It will be a RAID 6 array with 12 2TB drives. Server had 48 GB of
memory, 12 intel CPUs, 3Ware 9750 for RAID card.

Hi,

MySQL is free. Get the 64bit version and try.
http://www.mysql.com/downloads/mysql/
I see no reason to prefer the 5.1 version over the actual 5.5.x
I would even try not to use partitioning.

Oracle is free downloadable. Get the 64bit version and try.
http://www.oracle.com/technetwork/da...urceSiteId=oco
men


PG is free.
http://www.postgresql.org/download/linux

Invest the time and do some nice benchmarks we will love to participate!

Using different filesystems would also be very interesting. Reiserfs? Ext3/4.

Long time ago when i started to be an DBA i was -very- amused to see that insert performance did not
decrease with table size.

It was very interestint too see that delete operations are much more time cunsuming.

Remember: Right indexes will be much more important than everything else!

Do you insert while you do selects?
If yes rollback segemnts (as to speak in Oracle words) and their dimensioning will be important.

kind regards,

Toni

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.