dbTalk Databases Forums  

joins in mysql

comp.databases.mysql comp.databases.mysql


Discuss joins in mysql in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Marcus
 
Posts: n/a

Default joins in mysql - 08-05-2006 , 01:00 PM






Hello,

I've read numerous articles on the web advocating avoiding joins in
MySQL whenever possible for performance reasons. I was curious as to
just how slow a simple equi-join would be, so I ran some tests using the
benchmark code from this site:

http://www.developerfusion.co.uk/show/2058/

I made 2 tables and filled each with 3 million records and then
performed some joins and got these results with MySQL 4.1.12:

CREATE TABLE table1 (
id int(10) unsigned NOT NULL auto_increment,
field1 char(50) default NULL,
field2 char(50) default NULL,
PRIMARY KEY (id)
)

CREATE TABLE table2 (
id int(10) unsigned NOT NULL auto_increment,
field1 char(50) default NULL,
field2 char(50) default NULL,
PRIMARY KEY (id)
)

******************************

SELECT *
FROM table1 a, table2 b
WHERE a.id = b.id AND a.id = 1

OUTPUT:
num rows: 1
This page was created in 0.00327396392822 seconds

******************************

SELECT *
FROM table1 a, table2 b
WHERE a.id = b.id AND a.id >= 1 AND a.id <= 100

OUTPUT:
num rows: 100
This page was created in 0.00424695014954 seconds

******************************

SELECT *
FROM table1 a, table2 b
WHERE a.id = b.id AND a.id >= 1 AND a.id <= 10000

OUTPUT:
num rows: 10000
This page was created in 0.116240024567 seconds

******************************

SELECT *
FROM table1 a, table2 b
WHERE a.id = b.id AND a.id >= 1 AND a.id <= 100000

OUTPUT:
num rows: 100000
This page was created in 1.99258494377 seconds

******************************

SELECT *
FROM table1 a, table2 b
WHERE a.id = b.id AND a.id >= 1 AND a.id <= 1000000

OUTPUT:
num rows: 1000000
This page was created in 19.5196549892 seconds

******************************

That being said, is there any real merit to the idea of avoiding joins
whenever possible? 3 million rows per table seems like a fairly decent
size to me, and returning 10,000 rows from that join takes just a bit
over 1/10 of a second. Could someone please let me know if I am
oversimplifying all of this and am missing something important, because
otherwise these numbers seem very reasonable to me. Thanks in advance.

Reply With Quote
  #2  
Old   
Bill Karwin
 
Posts: n/a

Default Re: joins in mysql - 08-05-2006 , 03:13 PM






Marcus wrote:
Quote:
I've read numerous articles on the web advocating avoiding joins in
MySQL whenever possible for performance reasons.
That's nonsense.

By the same reasoning, we should avoid "while" loops in conventional
programming languages -- again, for performance reasons. I mean, they
_must_ be bad for performance! They run the same lines of code over and
over again! ;-)

It's true that we should avoid _unnecessary_ joins in SQL, just as we
should avoid unnecessary code in other languages. There are smart ways
to write code, and there are naive ways to write code. You know, like
moving loop-invariant code outside the loop, etc.

Likewise, it's smart to use join expressions that can be optimized and
that can use indexes, and it makes joins even between huge tables return
results very quickly.

The articles you read must be making a more subtle point. No one
writing about SQL should make such baseless, general statements.

Regards,
Bill K.


Reply With Quote
  #3  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: joins in mysql - 08-05-2006 , 03:16 PM



Marcus wrote:
Quote:
Hello,

I've read numerous articles on the web advocating avoiding joins in
MySQL whenever possible for performance reasons. I was curious as to
just how slow a simple equi-join would be, so I ran some tests using the
benchmark code from this site:

http://www.developerfusion.co.uk/show/2058/

First of all, don't believe everything you read on the internet! There
are a lot of people who have no real knowledge of what's going on
espousing all kinds of things.

To answer your question, this goes back to basics of DB theory and
normalization. There are 5 levels of normalization (in theory).

As you go to higher normalization levels, you get higher diversity and
less duplication of data. But at the same time it means you will
typically have more tables, requiring more joins. This means that
performance will often times be slower, requiring more CPU cycles (of
course, there are exceptions to this - but those are typically rare).

For instance, in first normal form you may have everything in one table,
requiring no joins. However, you would also have a massive duplication
of data. Take a bank, for instance. If a customer has a savings
account and a checking account, the customer's name, address, etc. would
have to be saved in each account's record.

Going to second normal form, you would have two tables - a customer
table, an account table. The account table would contain the customer's
id, requiring a single join.

But what if you have a joint account (two customers)? This doesn't work
as well. Yes, you could stay in second normal form and have multiple
entries in the account table for customer ids, but this has it's limits.
(There are other ways also, but they each have their advantages and
disadvantages)

Going to third normal form adds a third table containing a customer id
and an account id, linking the other two tables. This would require two
joins to get customer information for an account (or vice versa), making
it slower but more flexible.

The bottom line is - proper database design is not always easy and has
to consider the design of the entire database as well as the access
required by the program(s) using it. You can't say joins by themselves
are good or bad. They need to be taken in the context of the entire system.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.