dbTalk Databases Forums  

Sum of counts

mailing.database.myodbc mailing.database.myodbc


Discuss Sum of counts in the mailing.database.myodbc forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
AT
 
Posts: n/a

Default Sum of counts - 05-09-2006 , 10:55 AM






Here comes a newbie question...

I want to get a total of entries from four tables which all match a
particular id. The result for the id I'm testing (21) should be 233.
In my naivety, I thought something like this would work:

select count(a.id) + count(b.id) + count(c.id) + count(d.guide_id)
from table_a as a, table_b as b, table_c as c, table_d as d
where a.id = 21 and b.id = 21 and c.id = 21 and d.id = 21

....but no - I get about eight and a half million! I tried fiddling
about with joins but got no better luck. In fact if I replace those
+'s with commas I get four equal values of about 2.12 million.

In the end I got the correct result like this:

select
(select count(*) from table_a where id = 21) +
(select count(*) from table_b where id = 21) +
(select count(*) from table_c where id = 21) +
(select count(*) from table_d where id = 21)
as total

Two questions:

1 Is this the best way to do it? If not, what is?

2 This is fine in MySQL 5 (on my development platform), but 3.23 (on
the live platform until the host upgrades us) doesn't support
subqueries, so as an interim measure is there any better way than
doing four separate queries and adding up the total in the PHP script?

--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

If Yoda so strong in Force is, why he
words not right order in put can?

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=m...ie.nctu.edu.tw


Reply With Quote
  #2  
Old   
AT
 
Posts: n/a

Default Re: Sum of counts - 05-09-2006 , 12:28 PM







----- Original Message -----
From: "Chris Sansom" <chris (AT) highway57 (DOT) co.uk>
To: "MySQL List" <mysql (AT) lists (DOT) mysql.com>
Sent: Tuesday, May 09, 2006 11:47 AM
Subject: Sum of counts


Quote:
Here comes a newbie question...

I want to get a total of entries from four tables which all match a
particular id. The result for the id I'm testing (21) should be 233. In my
naivety, I thought something like this would work:

select count(a.id) + count(b.id) + count(c.id) + count(d.guide_id)
from table_a as a, table_b as b, table_c as c, table_d as d
where a.id = 21 and b.id = 21 and c.id = 21 and d.id = 21

...but no - I get about eight and a half million! I tried fiddling about
with joins but got no better luck. In fact if I replace those +'s with
commas I get four equal values of about 2.12 million.

In the end I got the correct result like this:

select
(select count(*) from table_a where id = 21) +
(select count(*) from table_b where id = 21) +
(select count(*) from table_c where id = 21) +
(select count(*) from table_d where id = 21)
as total

Two questions:

1 Is this the best way to do it? If not, what is?

2 This is fine in MySQL 5 (on my development platform), but 3.23 (on the
live platform until the host upgrades us) doesn't support subqueries, so
as an interim measure is there any better way than doing four separate
queries and adding up the total in the PHP script?

The reason you are getting so many rows has nothing to do with the way you
are using the count(*) function and adding the different count() results
together. The problem is that you are doing your joins incorrectly. Whenever
you join tables, you need to specify what the tables have in common by
writing "joining predicates"; the number of joining predicates you usually
write is the number of tables being joined minus one. Therefore, since you
are joining four tables, you need three joining predicates: one to join the
first table to the second, one to join the second table to the third, and
one to join the third table to the fourth.

A joining predicate looks like this:

table1.col4 = table2.col3

In other words, there are two column names with an equal sign in the middle.
I think you are attempting to get the same result by saying "a.id. = 21 and
b_id = 21" (etc.) but this is not having the effect that you want. In your
case, I think you need to change the original query to this:

select count(a.id) + count(b.id) + count(c.id) + count(d.guide_id)
from table_a as a, table_b as b, table_c as c, table_d as d
where a.id = b.id
and b.id = c.id
and c.id = d.id
and a.id = 21
and b.id = 21
and c.id = 21
and d.id = 21

This says that:
- a row in table_a is joined to a row in table_b when the 'id' value in
table_a is identical (and non-null) to the 'id' value in table_b
- a row in table_b is joined to a row in table_c when the 'id' value in
table_b is identical (and non-null) to the 'id' value in table_c
- a row in table_c is joined to a row in table_d when the 'id' value in
table_c is identical (and non-null) to the 'id' value in table_d
- the final result should only have rows where the id columns in each of the
four tables contain 21.

The conditions you wrote, namely 'a.id = 21', 'b.id = 21', 'c.id = 21' and
'd.id = 21' are called "local predicates", i.e. conditions that affect only
one table. In your case, you've said that out of all the rows in the result
set after the joins have been done, you only want rows where a.id, b.id,
c.id, and d.id are 21.

Unfortunately, you don't have any joining clauses so MySQL joins EVERY ROW
of the first table to EVERY row of the second table and EVERY ROW of the
second table gets joined to EVERY row of the third table and EVERY row of
the third table gets joined to EVERY row of the fourth table. This is called
a Cartesian product and is usually considered very bad news because it gives
you huge result sets in which most of the rows are joined to rows to which
they shouldn't be joined. I think you wrote this query on the assumption
that your local predicates would ensure that only the correct rows were
joined but, as you can see, that isn't the case.

If you try the query I gave you, you should see that it only joins rows when
the ids are equal; then, the local predicates ('a.id = 21', etc.) ensure
that you only get the joined rows you want, namely the ones that contain
21.)

This is a very common beginner mistake. Unfortunately, the MySQL manual
doesn't yet explain how to do joins very well; this is something that is
badly needed, in my opinion.

--
Rhino




--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.392 / Virus Database: 268.5.5/334 - Release Date: 08/05/2006


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=m...ie.nctu.edu.tw



Reply With Quote
  #3  
Old   
AT
 
Posts: n/a

Default Re: Sum of counts - 05-09-2006 , 01:03 PM



At 13:28 -0400 9/5/06, Rhino wrote:
Quote:
The reason you are getting so many rows has nothing to do with the
way you are using the count(*) function and adding the different
count() results together. The problem is that you are doing your
joins incorrectly... In your case, I think you need to change the
original query to this:

select count(a.id) + count(b.id) + count(c.id) + count(d.id)
from table_a as a, table_b as b, table_c as c, table_d as d
where a.id = b.id
and b.id = c.id
and c.id = d.id
and a.id = 21
and b.id = 21
and c.id = 21
and d.id = 21
Hi Rhino

Many thanks for the very full and frank response, but sadly it didn't
work. I do understand exactly what you said, and I even took it
further, adding in:

and a.id = c.id
and a.id = d.id
and b.id = d.id

....so that every table is thus related to every other one, but I'm
/still/ getting that damned eight and a half million instead of the
233 I expect!

I'm baffled by this, though the version I did with subqueries works
very nicely (and it's simple enough to do four separate queries and
add them together in the script for the older MySQL).

--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

Outside of a dog a man's best friend is a book.
Inside of a dog it's too dark to read.
-- Groucho Marx

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=m...ie.nctu.edu.tw



Reply With Quote
  #4  
Old   
AT
 
Posts: n/a

Default Re: Sum of counts - 05-10-2006 , 11:53 AM



Hi Chris, all,


Re-inserting Chris' original question:
Quote:
I want to get a total of entries from four tables which all match a
particular id. The result for the id I'm testing (21) should be 233.
In my naivety, I thought something like this would work:

select count(a.id) + count(b.id) + count(c.id) + count(d.guide_id)
from table_a as a, table_b as b, table_c as c, table_d as d
where a.id = 21 and b.id = 21 and c.id = 21 and d.id = 21

IMO, this is simply no task for a join,
probably not a task for any single SQL statement.

The easiest way is to have four separate "SELECT count(*) FROM table_?"
with the '?' replaced by 'a' .. 'd'.


More explanations below:


Chris Sansom wrote:
Quote:
At 13:28 -0400 9/5/06, Rhino wrote:
The reason you are getting so many rows has nothing to do with the way
you are using the count(*) function and adding the different count()
results together. The problem is that you are doing your joins
incorrectly... In your case, I think you need to change the original
query to this:

select count(a.id) + count(b.id) + count(c.id) + count(d.id)
from table_a as a, table_b as b, table_c as c, table_d as d
where a.id = b.id
and b.id = c.id
and c.id = d.id
and a.id = 21
and b.id = 21
and c.id = 21
and d.id = 21
First, the transitive equality on the 4 "id" columns (first 3
conditions) together with one restriction to 21 (say, on "a.id")
has no different effect than the 4 conditions "= 21",
in mathematical view 3 of these 7 conditions can be dropped.
(Not "any 3", but several different combinations.)

But that is not the cause of the problem - this is the join approach:
Remember that a join does a cartesian product, this is in no way helpful
to the solution of your task!


Let us construct a minimized example: Just two tables, each with three
rows, all having that magic value 21:

Table a Table b
id cnt id cnt
21 1 21 4
21 2 21 5
21 3 21 6

Doing a natural join on the "id" column will yield 9 rows:

a.id a.cnt b.id b.cnt
21 1 21 4
21 1 21 5
21 1 21 6
21 2 21 4
21 2 21 5
21 2 21 6
21 3 21 4
21 3 21 5
21 3 21 6

Summing "a.id" and "b.id" results in 18, where the correct value is 6.


Quote:
Hi Rhino

Many thanks for the very full and frank response, but sadly it didn't
work. I do understand exactly what you said, and I even took it further,
adding in:

and a.id = c.id
and a.id = d.id
and b.id = d.id

...so that every table is thus related to every other one, but I'm
/still/ getting that damned eight and a half million instead of the 233
I expect!
See above -
demanding all 4 columns to be equal to 21
is equivalent to demanding one is 21, and all are equal,
and also equivalent to some other combinations of conditions.

If you have a mathematical education, apply your knowledge of
"transitivity" to the problem.

Quote:
I'm baffled by this, though the version I did with subqueries works very
nicely (and it's simple enough to do four separate queries and add them
together in the script for the older MySQL).
Frankly spoken: This is the way to go!


From your problem description, there seems to be no connection between
the tables that would warrant joining them.

For a very coarse analogy:
If you have separate tables for the teachers and the pupils of a school,
and want to know the number of all peoples going there daily,
you will not join these two tables,
you will rather count them individually and then add these two values.


HTH,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=m...ie.nctu.edu.tw



Reply With Quote
  #5  
Old   
AT
 
Posts: n/a

Default Re: Sum of counts - 05-10-2006 , 12:21 PM



Hi Chris, Joerg, and everyone else following this discussion,

Joerg, you are correct; the best way to sum the tables is individually and
then add the sums together with program logic of some kind, such as might be
found in a script or application program or stored procedure.

I'm afraid I jumped in and gave correct but irrelevant information. I saw
that Chris's query lacked joining conditions so I explained why they were
needed and how to write them. Unfortunately, this was premature: I should
have thought about the basic problem more carefully first. Joerg, you are
absolutely right: if one table contains students and another contains
teachers, you don't count the number of people in the school by JOINING the
tables together. You count the people in each table separately and add the
two sums together. I don't know why that didn't come to me when I read
Chris's question but it didn't.

My apologies to all for wasting your time with an inappropriate solution.
I'll try not to do that again!

--
Rhino

----- Original Message -----
From: "Joerg Bruehe" <joerg (AT) mysql (DOT) com>
To: "Chris Sansom" <chris (AT) highway57 (DOT) co.uk>
Cc: "Rhino" <rhino1 (AT) sympatico (DOT) ca>; "MySQL List" <mysql (AT) lists (DOT) mysql.com>
Sent: Wednesday, May 10, 2006 12:04 PM
Subject: Re: Sum of counts


Quote:
Hi Chris, all,


Re-inserting Chris' original question:
| I want to get a total of entries from four tables which all match a
| particular id. The result for the id I'm testing (21) should be 233.
| In my naivety, I thought something like this would work:
|
| select count(a.id) + count(b.id) + count(c.id) + count(d.guide_id)
| from table_a as a, table_b as b, table_c as c, table_d as d
| where a.id = 21 and b.id = 21 and c.id = 21 and d.id = 21


IMO, this is simply no task for a join,
probably not a task for any single SQL statement.

The easiest way is to have four separate "SELECT count(*) FROM table_?"
with the '?' replaced by 'a' .. 'd'.


More explanations below:


Chris Sansom wrote:
At 13:28 -0400 9/5/06, Rhino wrote:
The reason you are getting so many rows has nothing to do with the way
you are using the count(*) function and adding the different count()
results together. The problem is that you are doing your joins
incorrectly... In your case, I think you need to change the original
query to this:

select count(a.id) + count(b.id) + count(c.id) + count(d.id)
from table_a as a, table_b as b, table_c as c, table_d as d
where a.id = b.id
and b.id = c.id
and c.id = d.id
and a.id = 21
and b.id = 21
and c.id = 21
and d.id = 21

First, the transitive equality on the 4 "id" columns (first 3 conditions)
together with one restriction to 21 (say, on "a.id")
has no different effect than the 4 conditions "= 21",
in mathematical view 3 of these 7 conditions can be dropped.
(Not "any 3", but several different combinations.)

But that is not the cause of the problem - this is the join approach:
Remember that a join does a cartesian product, this is in no way helpful
to the solution of your task!


Let us construct a minimized example: Just two tables, each with three
rows, all having that magic value 21:

Table a Table b
id cnt id cnt
21 1 21 4
21 2 21 5
21 3 21 6

Doing a natural join on the "id" column will yield 9 rows:

a.id a.cnt b.id b.cnt
21 1 21 4
21 1 21 5
21 1 21 6
21 2 21 4
21 2 21 5
21 2 21 6
21 3 21 4
21 3 21 5
21 3 21 6

Summing "a.id" and "b.id" results in 18, where the correct value is 6.



Hi Rhino

Many thanks for the very full and frank response, but sadly it didn't
work. I do understand exactly what you said, and I even took it further,
adding in:

and a.id = c.id
and a.id = d.id
and b.id = d.id

...so that every table is thus related to every other one, but I'm
/still/ getting that damned eight and a half million instead of the 233 I
expect!

See above -
demanding all 4 columns to be equal to 21
is equivalent to demanding one is 21, and all are equal,
and also equivalent to some other combinations of conditions.

If you have a mathematical education, apply your knowledge of
"transitivity" to the problem.


I'm baffled by this, though the version I did with subqueries works very
nicely (and it's simple enough to do four separate queries and add them
together in the script for the older MySQL).

Frankly spoken: This is the way to go!


From your problem description, there seems to be no connection between the
tables that would warrant joining them.

For a very coarse analogy:
If you have separate tables for the teachers and the pupils of a school,
and want to know the number of all peoples going there daily,
you will not join these two tables,
you will rather count them individually and then add these two values.


HTH,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=r...atico (DOT) ca


--
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.1.392 / Virus Database: 268.5.5/335 - Release Date: 09/05/2006




--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.392 / Virus Database: 268.5.5/335 - Release Date: 09/05/2006


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=m...ie.nctu.edu.tw



Reply With Quote
  #6  
Old   
AT
 
Posts: n/a

Default Re: Sum of counts - 05-10-2006 , 02:03 PM



You could you use UNION to make this all execute in a single query.

On 5/10/06, Rhino <rhino1 (AT) sympatico (DOT) ca> wrote:
Quote:
Hi Chris, Joerg, and everyone else following this discussion,

Joerg, you are correct; the best way to sum the tables is individually an=
d
then add the sums together with program logic of some kind, such as might=
be
found in a script or application program or stored procedure.

I'm afraid I jumped in and gave correct but irrelevant information. I saw
that Chris's query lacked joining conditions so I explained why they were
needed and how to write them. Unfortunately, this was premature: I should
have thought about the basic problem more carefully first. Joerg, you are
absolutely right: if one table contains students and another contains
teachers, you don't count the number of people in the school by JOINING t=
he
tables together. You count the people in each table separately and add th=
e
two sums together. I don't know why that didn't come to me when I read
Chris's question but it didn't.

My apologies to all for wasting your time with an inappropriate solution.
I'll try not to do that again!

--
Rhino

----- Original Message -----
From: "Joerg Bruehe" <joerg (AT) mysql (DOT) com
To: "Chris Sansom" <chris (AT) highway57 (DOT) co.uk
Cc: "Rhino" <rhino1 (AT) sympatico (DOT) ca>; "MySQL List" <mysql (AT) lists (DOT) mysql.com
Sent: Wednesday, May 10, 2006 12:04 PM
Subject: Re: Sum of counts


Hi Chris, all,


Re-inserting Chris' original question:
| I want to get a total of entries from four tables which all match a
| particular id. The result for the id I'm testing (21) should be 233.
| In my naivety, I thought something like this would work:
|
| select count(a.id) + count(b.id) + count(c.id) + count(d.guide_id)
| from table_a as a, table_b as b, table_c as c, table_d as d
| where a.id =3D 21 and b.id =3D 21 and c.id =3D 21 and d.id =3D 21


IMO, this is simply no task for a join,
probably not a task for any single SQL statement.

The easiest way is to have four separate "SELECT count(*) FROM table_?"
with the '?' replaced by 'a' .. 'd'.


More explanations below:


Chris Sansom wrote:
At 13:28 -0400 9/5/06, Rhino wrote:
The reason you are getting so many rows has nothing to do with the wa=
y
you are using the count(*) function and adding the different count()
results together. The problem is that you are doing your joins
incorrectly... In your case, I think you need to change the original
query to this:

select count(a.id) + count(b.id) + count(c.id) + count(d.id)
from table_a as a, table_b as b, table_c as c, table_d as d
where a.id =3D b.id
and b.id =3D c.id
and c.id =3D d.id
and a.id =3D 21
and b.id =3D 21
and c.id =3D 21
and d.id =3D 21

First, the transitive equality on the 4 "id" columns (first 3 condition=
s)
together with one restriction to 21 (say, on "a.id")
has no different effect than the 4 conditions "=3D 21",
in mathematical view 3 of these 7 conditions can be dropped.
(Not "any 3", but several different combinations.)

But that is not the cause of the problem - this is the join approach:
Remember that a join does a cartesian product, this is in no way helpfu=
l
to the solution of your task!


Let us construct a minimized example: Just two tables, each with three
rows, all having that magic value 21:

Table a Table b
id cnt id cnt
21 1 21 4
21 2 21 5
21 3 21 6

Doing a natural join on the "id" column will yield 9 rows:

a.id a.cnt b.id b.cnt
21 1 21 4
21 1 21 5
21 1 21 6
21 2 21 4
21 2 21 5
21 2 21 6
21 3 21 4
21 3 21 5
21 3 21 6

Summing "a.id" and "b.id" results in 18, where the correct value is 6.



Hi Rhino

Many thanks for the very full and frank response, but sadly it didn't
work. I do understand exactly what you said, and I even took it furthe=
r,
adding in:

and a.id =3D c.id
and a.id =3D d.id
and b.id =3D d.id

...so that every table is thus related to every other one, but I'm
/still/ getting that damned eight and a half million instead of the 23=
3 I
expect!

See above -
demanding all 4 columns to be equal to 21
is equivalent to demanding one is 21, and all are equal,
and also equivalent to some other combinations of conditions.

If you have a mathematical education, apply your knowledge of
"transitivity" to the problem.


I'm baffled by this, though the version I did with subqueries works ve=
ry
nicely (and it's simple enough to do four separate queries and add the=
m
together in the script for the older MySQL).

Frankly spoken: This is the way to go!


From your problem description, there seems to be no connection between =
the
tables that would warrant joining them.

For a very coarse analogy:
If you have separate tables for the teachers and the pupils of a school=
,
and want to know the number of all peoples going there daily,
you will not join these two tables,
you will rather count them individually and then add these two values.


HTH,
J=F6rg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Drhino1@sympatic=
o.ca


--
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.1.392 / Virus Database: 268.5.5/335 - Release Date: 09/05/20=
06





--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.392 / Virus Database: 268.5.5/335 - Release Date: 09/05/2006


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3...mail (DOT) com


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3...ie.nctu.edu.tw



Reply With Quote
  #7  
Old   
AT
 
Posts: n/a

Default Re: Sum of counts - 05-11-2006 , 05:01 AM



Hi Rhino, all,


Rhino wrote:
Quote:
Hi Chris, Joerg, and everyone else following this discussion,

Joerg, you are correct; the best way to sum the tables is individually
and then add the sums together with program logic of some kind, [[...]]

I'm afraid I jumped in and gave correct but irrelevant information. I
saw that Chris's query lacked joining conditions so I explained why they
were needed and how to write them. Unfortunately, this was premature: I
should have thought about the basic problem more carefully first. [[...]]

My apologies to all for wasting your time with an inappropriate
solution. I'll try not to do that again!
Well, things like this happen, and none of us is proof against them.

IMO, that is what a mailing list is good for:
- that we get info, tipps, ... from each other,
- that we improve on existing material, reaching better overall results,
- that we can correct each other where we went astray,
- and that we even learn from published errors.

Don't be too angry at yourself, such incidents may also provide help to
others!

Regards
and thanks for _all_ your replies,

Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=m...ie.nctu.edu.tw



Reply With Quote
  #8  
Old   
AT
 
Posts: n/a

Default Re: Sum of counts - 05-11-2006 , 05:18 AM



Hi Adam, all,


Adam Wolff wrote:
Quote:
You could you use UNION to make this all execute in a single query.
Sure you can use UNION to get the data or the (separate) counts from the
four tables, but a plain UNION can not sum over these individual parts.

I have not checked whether you can have a UNION in a subquery, this
would be your only chance to do the OP's task in a single SQL statement.
But IMO, it is better structure to do four separate SELECTs on the
individual tables than to force it all into one.

Regards,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=m...ie.nctu.edu.tw



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.