dbTalk Databases Forums  

Re: MySQL not designed for LARGE databases?

comp.databases.mysql comp.databases.mysql


Discuss Re: MySQL not designed for LARGE databases? in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Luuk
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 12:43 PM






Op 27-6-2010 18:42, Ignoramus12901 schreef:
Quote:
on a 32 bit system, mysql client would die once it reaches the 32 bit
hard limit. On 64 bit, it grows indefinitely.

On a 64 bit machine it will probably crash when it reaches the 64 bit
'hard limit'....

I think this is OK,

Why do you want to do a 'SELECT myfield from BIGTABLE'

is it not enough to do a:
'SELECT myfield from BIGTABLE LIMIT 100;' ?

What do you want to do with this 'millions' fo myfield values?

I'm sure there's a better way than:
'SELECT myfield from BIGTABLE'
just to do the same.....

--
Luuk

Reply With Quote
  #2  
Old   
Axel Schwenke
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 12:53 PM






Ignoramus12901 <ignoramus12901 (AT) NOSPAM (DOT) 12901.invalid> wrote:

[rant snipped]

You've shot yourself in the foot because you never learned to aim.
This is *not* the fault of the gun.

The answer to your (unspoken) questions needs only 4 letters:

RTFM!

Quote:
One example was a big database (that was my personal) that somehow
ended up with a corrupt table. REPAIR TABLE took literaly 13 hours
and still was ongoing. I had to cancel it and move the database to a
SSD, where repair took 30 minutes.
MyISAM table, I suppose. And indeed InnoDB is recommended for big
tables because it's much less susceptible to corruption. As for the
slow repair: this was probably "repair with keycache" because you
did not configure a large enough sort file:

http://dev.mysql.com/doc/refman/5.1/...sort_file_size

Quote:
Now here's what happened with the 90 gig database: I have a big table
and made a query

echo "select myfield from hugetable" | mysql -uuser -ppassword --batch -D mydatabase > field.txt

A seemingly innocent query.
A foolish query. By calling it "innocent" you just show your ignorance.

Quote:
It hang the Ubuntu Hardy server completely. I had to remotely hit a
RESET button.
It did not "hang" the server. It just used all available memory to
buffer the result. When memory was full, your machine started swapping
and became much, much (factor 1000+) slower. It had eventually stopped
with "out of memory". You preferred to reset(!) it before that.

Quote:
So my first question is, is there some option to mysql that prevents
it from growing like crazy.
The option is not "don't grow like crazy" but "don't cache the result"
and you find it here:

http://dev.mysql.com/doc/refman/5.1/...on_mysql_quick

some background:

http://dev.mysql.com/doc/refman/5.1/...re-result.html
vs.
http://dev.mysql.com/doc/refman/5.1/...se-result.html

By default all clients use mysql_store_result() or equivalent in order
to free any locks in the server asap. If you deal with very large
result sets (why would you?), especially result sets larger than your
memory (no really, why would you?) then you must turn that off.


XL

Reply With Quote
  #3  
Old   
Gordon Burditt
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 02:02 PM



Quote:
One example was a big database (that was my personal) that somehow
ended up with a corrupt table. REPAIR TABLE took literaly 13 hours
and still was ongoing. I had to cancel it and move the database to a
SSD, where repair took 30 minutes.
If you kill queries in the middle and restart them, expect corrupted
tables.

Quote:
Now here's what happened with the 90 gig database: I have a big table
and made a query

echo "select myfield from hugetable" | mysql -uuser -ppassword --batch
-D mydatabase > field.txt

A seemingly innocent query.
The C API normally used to fetch query results (mysql_store_result)
fetches the entire result set before giving you the first row.
Among other advantages of this is being able to issue more queries
before you have finished with this result without getting an "out
of sync" error. However, this is very, very bad if the results of
the query are huge. You can fetch row-by-row with mysql_use_result()
and mysql_fetch_row(). The mysql command-line client (which uses
the C API) can be made to use this with the -q option.

The problem is not *large databases*, it's *large query result
sets*.

Quote:
It hang the Ubuntu Hardy server completely. I had to remotely hit a
RESET button.
That's a great way to get corrupted tables.

Reply With Quote
  #4  
Old   
Ignoramus12901
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 02:56 PM



On 2010-06-27, Luuk <luuk (AT) invalid (DOT) lan> wrote:
Quote:
Op 27-6-2010 18:42, Ignoramus12901 schreef:
on a 32 bit system, mysql client would die once it reaches the 32 bit
hard limit. On 64 bit, it grows indefinitely.


On a 64 bit machine it will probably crash when it reaches the 64 bit
'hard limit'....

I think this is OK,

Why do you want to do a 'SELECT myfield from BIGTABLE'

is it not enough to do a:
'SELECT myfield from BIGTABLE LIMIT 100;' ?
no

Quote:
What do you want to do with this 'millions' fo myfield values?

I'm sure there's a better way than:
'SELECT myfield from BIGTABLE'
just to do the same.....

I want to do some stats on the values once I extract them.

i

Reply With Quote
  #5  
Old   
Luuk
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 04:02 PM



Op 27-6-2010 21:56, Ignoramus12901 schreef:
Quote:
On 2010-06-27, Luuk<luuk (AT) invalid (DOT) lan> wrote:
Op 27-6-2010 18:42, Ignoramus12901 schreef:
on a 32 bit system, mysql client would die once it reaches the 32 bit
hard limit. On 64 bit, it grows indefinitely.


On a 64 bit machine it will probably crash when it reaches the 64 bit
'hard limit'....

I think this is OK,

Why do you want to do a 'SELECT myfield from BIGTABLE'

is it not enough to do a:
'SELECT myfield from BIGTABLE LIMIT 100;' ?

no

What do you want to do with this 'millions' fo myfield values?

I'm sure there's a better way than:
'SELECT myfield from BIGTABLE'
just to do the same.....


I want to do some stats on the values once I extract them.

i
'some stats', is very vague

there's some statistical functions in MySQL to

like:
'SELECT SUM(myfield), MIN(myfield), MAX(myfield), AVG(myfield) from
BIGTABLE'

--
Luuk

Reply With Quote
  #6  
Old   
Beauregard T. Shagnasty
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 04:36 PM



Ignoramus12901 wrote:

Quote:
Norman Peelman <npeelman (AT) cfl (DOT) rr.com> wrote:
num_rows * size_of_field = bytes_required

275 million rows, field is int. Not too much.
int? int what? int(11)?
That would be over three billion bytes.
Not too much.

--
-bts
-Four wheels carry the body; two wheels move the soul

Reply With Quote
  #7  
Old   
Lennart Jonsson
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 04:48 PM



On 2010-06-27 17:54, Ignoramus12901 wrote:
[...]
Quote:
Now here's what happened with the 90 gig database: I have a big table
and made a query

echo "select myfield from hugetable" | mysql -uuser -ppassword --batch -D mydatabase > field.txt

A seemingly innocent query.

A query without a predicate on a large table usually means trouble, at
least if it is an OLTP like database. If this is more of a
datawarehouse, why do you need all that data in the application?

What kind of disk subsystem are you using?

Quote:
It hang the Ubuntu Hardy server completely. I had to remotely hit a
RESET button.

Did you investigate disk i/o before shutting it down?

[...]

Quote:
If yes, I will be happy. If not, I have to question whether this whole
mysql was ever really designed to handle big databases.

I haven't used mysql other than for very limited data volumes, so
personally I can not really tell how well it works. From what I've heard
people are using it for fairly large data volumes, so it should be
possible. In my experience from other dbms's, dealing with large volumes
of data requires a fair share of hardware, careful tuning of the dbms,
and having developers asking the right questions.


/Lennart

Reply With Quote
  #8  
Old   
Axel Schwenke
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 04:49 PM



Ignoramus12901 <ignoramus12901 (AT) NOSPAM (DOT) 12901.invalid> wrote:
Quote:
On 2010-06-27, Jerry Stuckle <jstucklex (AT) attglobal (DOT) net> wrote:

My question is why mysql client is buffering the result in memory?
Because that's the intended and documented, but configurable behavior.

Quote:
That's how all RDBMS's work by default. Gather the entire result set,
then return it to the client on request. --batch does not affect buffering.
Indeed. But --quick does. This has now been written twice. Maybe news
article routing is asymmetric. I can see this "discussion" which does
not hit the point at all, but silence on other posts that should be
much more helpful...

Quote:
What I am observing is that mysqld is working, moderately hard, and is
not growing.

At the same time, "mysql" (the client) is just growing in memory
continuously, from the beginning of the query.
Congratulations! You observed the intended (and to say it again)
*documented* behavior of result buffering in the client.

Quote:
If this is a correct conclusion, then I would like to question the
client's behavior.
I would rather question why you don't know such basics and why you
don't just instruct the client to *not* buffer the result.

Quote:
Anyway, I will try tomorrow to write a perl/DBI based client to see if
I can get around this by not using mysql as client. Maybe it will work
and maybe it won't.
If you instruct this client to not buffer the result then it will work.
Just like the 'mysql' command line client. If you continue to ignore
good advice, then it will not work.


XL

Reply With Quote
  #9  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 04:56 PM



Beauregard T. Shagnasty wrote:
Quote:
Ignoramus12901 wrote:

Norman Peelman <npeelman (AT) cfl (DOT) rr.com> wrote:
num_rows * size_of_field = bytes_required
275 million rows, field is int. Not too much.

int? int what? int(11)?
That would be over three billion bytes.
Not too much.

No, int(11) is still only 4 or 8 bytes - the 11 only relates to the
display length, not internal storage.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #10  
Old   
Ignoramus12901
 
Posts: n/a

Default Re: MySQL not designed for LARGE databases? - 06-27-2010 , 09:58 PM



On 2010-06-27, Axel Schwenke <axel.schwenke (AT) gmx (DOT) de> wrote:
Quote:
Ignoramus12901 <ignoramus12901 (AT) NOSPAM (DOT) 12901.invalid> wrote:
On 2010-06-27, Jerry Stuckle <jstucklex (AT) attglobal (DOT) net> wrote:

My question is why mysql client is buffering the result in memory?

Because that's the intended and documented, but configurable behavior.

That's how all RDBMS's work by default. Gather the entire result set,
then return it to the client on request. --batch does not affect buffering.

Indeed. But --quick does. This has now been written twice. Maybe news
article routing is asymmetric. I can see this "discussion" which does
not hit the point at all, but silence on other posts that should be
much more helpful...
Axel, thanks for mentioning --quick switch. I now believe that I
needed exactly that switch and it is a perfect answer to my question.

Thank you. If anyone else posted a reply, which I have not yet seen,
that also suggest --quick, I am grateful to all who replied.

Quote:
What I am observing is that mysqld is working, moderately hard, and is
not growing.

At the same time, "mysql" (the client) is just growing in memory
continuously, from the beginning of the query.

Congratulations! You observed the intended (and to say it again)
*documented* behavior of result buffering in the client.

If this is a correct conclusion, then I would like to question the
client's behavior.

I would rather question why you don't know such basics and why you
don't just instruct the client to *not* buffer the result.

Anyway, I will try tomorrow to write a perl/DBI based client to see if
I can get around this by not using mysql as client. Maybe it will work
and maybe it won't.

If you instruct this client to not buffer the result then it will work.
Just like the 'mysql' command line client. If you continue to ignore
good advice, then it will not work.
--quick is all I need, I think, and I want to thank you.

i

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.