dbTalk Databases Forums  

Large databases (100GB+)

comp.databases.mysql comp.databases.mysql


Discuss Large databases (100GB+) in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Ignoramus28131
 
Posts: n/a

Default Large databases (100GB+) - 12-09-2010 , 08:57 PM






I have a personal business need to have a very large SQL database,
which I wanted to host on a large SSD. (properly backed up, of
course).

I would like to know how well MySQL handles datasets this large.

I had mysql databases in tens of GBs and once in a while, I would have
to do repair table. On regular hard drives, this would take literally
forever, and on SSDs under an hour.

I would not be too quick to blame mysql for this, as I tend to think
that it is the Ubuntu's process of updating apparmor that ended up
killing mysql.

Anyway, this repairing is VERY painful and I lose money while repair
table is done.

So. What alternatives (to mysql or some non-default mysql storage
engines) do I have to hold large tables?

My second question is backup. Is there something short of replication
that I can use to incrementally back up large datasets?

i

Reply With Quote
  #2  
Old   
Axel Schwenke
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-10-2010 , 03:28 AM






Ignoramus28131 <ignoramus28131 (AT) NOSPAM (DOT) 28131.invalid> wrote:

Quote:
I have a personal business need to have a very large SQL database,
which I wanted to host on a large SSD. (properly backed up, of
course).
SSD might be good. Or not. If your load requires many random reads,
then SSD shine. Write performance OTOH is rather limited. Especially
for short, streaming writes (log I/O).

Quote:
I would like to know how well MySQL handles datasets this large.
That's not an exact question. Thus there is no exact answer.
Just: 100GB is not large. Real computers have that (and more) RAM.

Quote:
I had mysql databases in tens of GBs and once in a while, I would have
to do repair table.
MyISAM, I guess? And not properly shutdown?
Having to run CHECK/REPAIR table is purely a MyISAM problem. That's
one reason why most "real" MySQL installations use InnoDB nowadays.

Quote:
Anyway, this repairing is VERY painful and I lose money while repair
table is done.
This won't happen with InnoDB. Just make sure you use a recent version
of MySQL to get the fast recovery implementation. I suggest to wait
for 5.5 becoming GA (should be really soon).

Quote:
My second question is backup. Is there something short of replication
that I can use to incrementally back up large datasets?
Incremental backups are overrated. What you really want, is fast
restore of last backup and then binlog roll forward.
With InnoDB even mysqldump is nonlocking. But for bigger databases you
want physical backup anyway. For that there is InnoDB hotbackup (now
integrated in MySQL Enterprise Backup, free to download). And from
Percona there is XtraBackup (open source)


XL

Reply With Quote
  #3  
Old   
The Natural Philosopher
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-10-2010 , 04:56 AM



Axel Schwenke wrote:
Quote:
Ignoramus28131 <ignoramus28131 (AT) NOSPAM (DOT) 28131.invalid> wrote:

I have a personal business need to have a very large SQL database,
which I wanted to host on a large SSD. (properly backed up, of
course).

SSD might be good. Or not. If your load requires many random reads,
then SSD shine. Write performance OTOH is rather limited. Especially
for short, streaming writes (log I/O).

I would like to know how well MySQL handles datasets this large.

That's not an exact question. Thus there is no exact answer.
Just: 100GB is not large. Real computers have that (and more) RAM.

I had mysql databases in tens of GBs and once in a while, I would have
to do repair table.

MyISAM, I guess? And not properly shutdown?
Having to run CHECK/REPAIR table is purely a MyISAM problem. That's
one reason why most "real" MySQL installations use InnoDB nowadays.

Anyway, this repairing is VERY painful and I lose money while repair
table is done.

This won't happen with InnoDB. Just make sure you use a recent version
of MySQL to get the fast recovery implementation. I suggest to wait
for 5.5 becoming GA (should be really soon).

My second question is backup. Is there something short of replication
that I can use to incrementally back up large datasets?

Incremental backups are overrated. What you really want, is fast
restore of last backup and then binlog roll forward.
With InnoDB even mysqldump is nonlocking. But for bigger databases you
want physical backup anyway. For that there is InnoDB hotbackup (now
integrated in MySQL Enterprise Backup, free to download). And from
Percona there is XtraBackup (open source)


XL
I have to vote that best answer of the week.

It addresses every point with exact information.

Its so nice to learn things you didn't know already on Usenet, rather
than have to skip through someone's opinions..

Reply With Quote
  #4  
Old   
Ignoramus13683
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-10-2010 , 09:58 PM



On 2010-12-10, Axel Schwenke <axel.schwenke (AT) gmx (DOT) de> wrote:
Quote:
Ignoramus28131 <ignoramus28131 (AT) NOSPAM (DOT) 28131.invalid> wrote:

I have a personal business need to have a very large SQL database,
which I wanted to host on a large SSD. (properly backed up, of
course).
First of all, thanks for an excellent response.

Quote:
SSD might be good. Or not. If your load requires many random reads,
then SSD shine. Write performance OTOH is rather limited. Especially
for short, streaming writes (log I/O).
I would say, mostly random reads and random writes and selects.

Quote:
I would like to know how well MySQL handles datasets this large.

That's not an exact question. Thus there is no exact answer.
Just: 100GB is not large. Real computers have that (and more) RAM.
OK, it is large in terms of dealing with REPAIR TABLES.

Quote:
I had mysql databases in tens of GBs and once in a while, I would have
to do repair table.

MyISAM, I guess? And not properly shutdown?
Yes, exactly.

Quote:
Having to run CHECK/REPAIR table is purely a MyISAM problem. That's
one reason why most "real" MySQL installations use InnoDB nowadays.
Is there any disadvantage of InnoDB that I might overlook?

Quote:
Anyway, this repairing is VERY painful and I lose money while repair
table is done.

This won't happen with InnoDB. Just make sure you use a recent version
of MySQL to get the fast recovery implementation. I suggest to wait
for 5.5 becoming GA (should be really soon).
How about 5.1, would this work satisfactorily?

Quote:
My second question is backup. Is there something short of replication
that I can use to incrementally back up large datasets?

Incremental backups are overrated. What you really want, is fast
restore of last backup and then binlog roll forward.
OK

Quote:
With InnoDB even mysqldump is nonlocking. But for bigger databases you
want physical backup anyway. For that there is InnoDB hotbackup (now
integrated in MySQL Enterprise Backup, free to download). And from
Percona there is XtraBackup (open source)
This is very nice. It looks like MySQL plus InnoDB is the
winner. Right?

i

Reply With Quote
  #5  
Old   
Luuk
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-11-2010 , 02:26 AM



On 10-12-10 03:57, Ignoramus28131 wrote:
Quote:
I had mysql databases in tens of GBs and once in a while, I would have
to do repair table. On regular hard drives, this would take literally
forever, and on SSDs under an hour.

I would not be too quick to blame mysql for this, as I tend to think
that it is the Ubuntu's process of updating apparmor that ended up
killing mysql.
you could also focus on why you need to do repairs,
and take away thecause of this

--
Luuk

Reply With Quote
  #6  
Old   
John Nagle
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-11-2010 , 11:05 AM



On 12/11/2010 12:26 AM, Luuk wrote:
Quote:
On 10-12-10 03:57, Ignoramus28131 wrote:
I had mysql databases in tens of GBs and once in a while, I would have
to do repair table. On regular hard drives, this would take literally
forever, and on SSDs under an hour.

I would not be too quick to blame mysql for this, as I tend to think
that it is the Ubuntu's process of updating apparmor that ended up
killing mysql.

you could also focus on why you need to do repairs,
and take away thecause of this
Yes. If you don't have ECC memory, you WILL get data corruption.
See "http://www.stanford.edu/class/ee380/Abstracts/101110.html".
The architects of Google point out that if you sort a terabyte twice
without ECC memory, you will get different results.

John Nagle

Reply With Quote
  #7  
Old   
Ignoramus23245
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-11-2010 , 11:37 AM



On 2010-12-11, Luuk <Luuk (AT) invalid (DOT) lan> wrote:
Quote:
On 10-12-10 03:57, Ignoramus28131 wrote:
I had mysql databases in tens of GBs and once in a while, I would have
to do repair table. On regular hard drives, this would take literally
forever, and on SSDs under an hour.

I would not be too quick to blame mysql for this, as I tend to think
that it is the Ubuntu's process of updating apparmor that ended up
killing mysql.

you could also focus on why you need to do repairs,
and take away thecause of this

I suspect that it happened because

1) Apparmor was enabled
2) MySQL database was in a location other than in permitted apparmor
location
3) System updates restore default apparmor configs and kill mysql.


I am not 100% sure about this.

i

Reply With Quote
  #8  
Old   
Luuk
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-11-2010 , 11:51 AM



On 11-12-10 18:37, Ignoramus23245 wrote:
Quote:
On 2010-12-11, Luuk<Luuk (AT) invalid (DOT) lan> wrote:
On 10-12-10 03:57, Ignoramus28131 wrote:
I had mysql databases in tens of GBs and once in a while, I would have
to do repair table. On regular hard drives, this would take literally
forever, and on SSDs under an hour.

I would not be too quick to blame mysql for this, as I tend to think
that it is the Ubuntu's process of updating apparmor that ended up
killing mysql.

you could also focus on why you need to do repairs,
and take away thecause of this


I suspect that it happened because

1) Apparmor was enabled
2) MySQL database was in a location other than in permitted apparmor
location
3) System updates restore default apparmor configs and kill mysql.


I am not 100% sure about this.

i
If apparmor really does this, than you should NOT use apparmor....

--
Luuk

Reply With Quote
  #9  
Old   
Axel Schwenke
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-11-2010 , 01:26 PM



Ignoramus13683 <ignoramus13683 (AT) NOSPAM (DOT) 13683.invalid> wrote:
Quote:
On 2010-12-10, Axel Schwenke <axel.schwenke (AT) gmx (DOT) de> wrote:

Having to run CHECK/REPAIR table is purely a MyISAM problem. That's
one reason why most "real" MySQL installations use InnoDB nowadays.

Is there any disadvantage of InnoDB that I might overlook?
The obvious points are the lacking support for FULLTEXT and SPATIAL
indexes in InnoDB. Also InnoDB tends to allocate more disk space for
the same data (factor 1.5 to 2 compared to MyISAM) - but OTOH InnoDB
suffers much less from fragmentation than MyISAM does.

But the biggest difference is, that InnoDB is a transactional engine
(and that's also the reason why it can recover much faster, i.e.
after a crash). If your application does not use explicite transaction
control, then InnoDB will run in AUTOCOMMIT mode and this means you
will see huge amounts of log I/O.

The preferred solution is to add transaction control to your
application code (frame groups of DML statements in BEGIN/COMMIT).
If you cannot do that, you might need to configure
innodb_flush_log_at_trx_commit=2 to avoid the AUTOCOMMIT
performance trap.

See:

http://dev.mysql.com/doc/refman/5.1/en/commit.html
http://dev.mysql.com/doc/refman/5.1/...db-tuning.html
http://dev.mysql.com/doc/refman/5.1/...arameters.html

Quote:
... make sure you use a recent version
of MySQL to get the fast recovery implementation. I suggest to wait
for 5.5 becoming GA (should be really soon).

How about 5.1, would this work satisfactorily?
Yes, provided you use a recent version. In 5.1 there are two InnoDB
implementations: the builtin (old) and the plugin (new). Make sure
you use the plugin as it has some nice new features (inluding fast
recovery). In 5.5 the plugin code is used as builtin InnoDB.

See:

http://dev.mysql.com/doc/refman/5.1/...in-innodb.html
http://blogs.innodb.com/wp/2010/04/i...ance-recovery/

Quote:
This is very nice. It looks like MySQL plus InnoDB is the
winner. Right?
InnoDB is the superior engine for almost all use cases (there are
edge cases though). In 5.5 InnoDB becomes the default engine.
From a design point of view, InnoDB ist very similar to Oracle DB.
Only better (says Heikki


XL

Reply With Quote
  #10  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Large databases (100GB+) - 12-11-2010 , 01:54 PM



On 12/11/2010 12:05 PM, John Nagle wrote:
Quote:
On 12/11/2010 12:26 AM, Luuk wrote:
On 10-12-10 03:57, Ignoramus28131 wrote:
I had mysql databases in tens of GBs and once in a while, I would have
to do repair table. On regular hard drives, this would take literally
forever, and on SSDs under an hour.

I would not be too quick to blame mysql for this, as I tend to think
that it is the Ubuntu's process of updating apparmor that ended up
killing mysql.

you could also focus on why you need to do repairs,
and take away thecause of this

Yes. If you don't have ECC memory, you WILL get data corruption.
See "http://www.stanford.edu/class/ee380/Abstracts/101110.html".
The architects of Google point out that if you sort a terabyte twice
without ECC memory, you will get different results.

John Nagle

If that were the real case, you never could trust data on any computer,
and computers would have many more outages due to bits being flipped or
dropped in machine code and system data. And some OS's run for months
or even years with no problems and no reboots.

Not only that, but "one terabyte of data" is not a sorting measure. Are
you talking 2 records, each 500GB? Or are you talking 100 billion
records, each 10 bytes? Or somewhere in between.

But then, of course, that article is talking about memory from 1997-1999
- today's memory is much denser and more reliable in comparison.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.