dbTalk Databases Forums  

How to get a distinct count of result set of multople table joins?

comp.databases.oracle.misc comp.databases.oracle.misc


Discuss How to get a distinct count of result set of multople table joins? in the comp.databases.oracle.misc forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Chris ( Val )
 
Posts: n/a

Default How to get a distinct count of result set of multople table joins? - 02-26-2008 , 10:26 PM






Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.

<QUERY>
SELECT * FROM schema.tbl_A a, schema.tbl_B b, schema.tbl_C c,
schema.tbl_D d, schema.tbl_E e, schema.tbl_F
f,
schema.tbl_G g, schema.tbl_H h, schema.tbl_I
i,
schema.tbl_J j, schema.tbl_K k, schema.tbl_L
l,
schema.tbl_M m
WHERE a.accountid = b.ID(+)
AND a.custom1id = c.ID(+)
AND a.custom2id = d.ID(+)
AND a.custom3id = e.ID
AND a.custom4id = f.ID(+)
AND a.entityid = g.ID(+)
AND a.icpid = h.ID(+)
AND a.parentid = i.ID(+)
AND a.periodid = j.ID(+)
AND a.scenarioid = k.ID(+)
AND a.valueid = l.ID(+)
AND a.viewid = m.ID(+)
AND d.label IN ('XXX', 'XXX')
AND e.label IN ('XXX', 'XXX')
AND f.label = 'X'
AND k.label = 'XXX'
AND l.label = 'XXX'
AND h.label = 'X'
AND m.label IN ('X', 'X');
UNION
<rest (the same as above) snipped>
</QUERY>

Problem - I am having some trouble getting a DISTINCT COUNT of the
result set from the above. I trieed the
following, but I still get the same (non-distinct) amount returned for
the count.

How do I get a count of the distinct records / result set?
I tried with the UNION as above, as well as the follwoing, but still
get a count which includes the duplicates.

SELECT COUNT(*) FROM
(
SELECT DISTINCT* FROM schema.tbl_A a, schema.tbl_B b, schema.tbl_C
c,
schema.tbl_D d, schema.tbl_E e, schema.tbl_F
f,
schema.tbl_G g, schema.tbl_H h, schema.tbl_I
i,
schema.tbl_J j, schema.tbl_K k, schema.tbl_L
l,
schema.tbl_M m
WHERE a.accountid = b.ID(+)
AND a.custom1id = c.ID(+)
AND a.custom2id = d.ID(+)
AND a.custom3id = e.ID
AND a.custom4id = f.ID(+)
AND a.entityid = g.ID(+)
AND a.icpid = h.ID(+)
AND a.parentid = i.ID(+)
AND a.periodid = j.ID(+)
AND a.scenarioid = k.ID(+)
AND a.valueid = l.ID(+)
AND a.viewid = m.ID(+)
AND d.label IN ('XXX', 'XXX')
AND e.label IN ('XXX', 'XXX')
AND f.label = 'X'
AND k.label = 'XXX'
AND l.label = 'XXX'
AND h.label = 'X'
AND m.label IN ('X', 'X')
);

Thanks in advance,
Chris

Reply With Quote
  #2  
Old   
Ed Prochak
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 07:05 AM






On Feb 26, 11:26 pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:
Quote:
Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.
[]

Thanks in advance,
Chris
What does the EXPLAIN PLAN show?

Performance issues are always data dependent. IOW, the phrase:
KNOW THY DATA
is one to take to heart.

Come back with the PLAN and we may have some suggestions.
Ed


Reply With Quote
  #3  
Old   
Ed Prochak
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 07:05 AM



On Feb 26, 11:26 pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:
Quote:
Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.
[]

Thanks in advance,
Chris
What does the EXPLAIN PLAN show?

Performance issues are always data dependent. IOW, the phrase:
KNOW THY DATA
is one to take to heart.

Come back with the PLAN and we may have some suggestions.
Ed


Reply With Quote
  #4  
Old   
Ed Prochak
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 07:05 AM



On Feb 26, 11:26 pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:
Quote:
Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.
[]

Thanks in advance,
Chris
What does the EXPLAIN PLAN show?

Performance issues are always data dependent. IOW, the phrase:
KNOW THY DATA
is one to take to heart.

Come back with the PLAN and we may have some suggestions.
Ed


Reply With Quote
  #5  
Old   
Ed Prochak
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 07:05 AM



On Feb 26, 11:26 pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:
Quote:
Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.
[]

Thanks in advance,
Chris
What does the EXPLAIN PLAN show?

Performance issues are always data dependent. IOW, the phrase:
KNOW THY DATA
is one to take to heart.

Come back with the PLAN and we may have some suggestions.
Ed


Reply With Quote
  #6  
Old   
Mark D Powell
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 09:02 AM



On Feb 27, 8:05*am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote:
Quote:
On Feb 26, 11:26 pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:

Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.
[]

Thanks in advance,
Chris

What does the EXPLAIN PLAN show?

Performance issues are always data dependent. IOW, the phrase:
KNOW THY DATA
is one to take to heart.

Come back with the PLAN and we may have some suggestions.
* *Ed
Chris, if the query in the select distinct is the same as the first
there should be no duplicates returned.

What is the full version of Oracle?
Does the explain plan Ed asked for show a hash group by?

If so, depending on your version of Oracle, there is a bug where
queries solved with a hash group by can return wrong results; however,
I would expect the distinct to eliminate duplicates still.

Are you sure you just do not have more rows than you expected?

HTH -- Mark D Powell --


Reply With Quote
  #7  
Old   
Mark D Powell
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 09:02 AM



On Feb 27, 8:05*am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote:
Quote:
On Feb 26, 11:26 pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:

Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.
[]

Thanks in advance,
Chris

What does the EXPLAIN PLAN show?

Performance issues are always data dependent. IOW, the phrase:
KNOW THY DATA
is one to take to heart.

Come back with the PLAN and we may have some suggestions.
* *Ed
Chris, if the query in the select distinct is the same as the first
there should be no duplicates returned.

What is the full version of Oracle?
Does the explain plan Ed asked for show a hash group by?

If so, depending on your version of Oracle, there is a bug where
queries solved with a hash group by can return wrong results; however,
I would expect the distinct to eliminate duplicates still.

Are you sure you just do not have more rows than you expected?

HTH -- Mark D Powell --


Reply With Quote
  #8  
Old   
Mark D Powell
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 09:02 AM



On Feb 27, 8:05*am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote:
Quote:
On Feb 26, 11:26 pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:

Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.
[]

Thanks in advance,
Chris

What does the EXPLAIN PLAN show?

Performance issues are always data dependent. IOW, the phrase:
KNOW THY DATA
is one to take to heart.

Come back with the PLAN and we may have some suggestions.
* *Ed
Chris, if the query in the select distinct is the same as the first
there should be no duplicates returned.

What is the full version of Oracle?
Does the explain plan Ed asked for show a hash group by?

If so, depending on your version of Oracle, there is a bug where
queries solved with a hash group by can return wrong results; however,
I would expect the distinct to eliminate duplicates still.

Are you sure you just do not have more rows than you expected?

HTH -- Mark D Powell --


Reply With Quote
  #9  
Old   
Mark D Powell
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 09:02 AM



On Feb 27, 8:05*am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote:
Quote:
On Feb 26, 11:26 pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:

Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.
[]

Thanks in advance,
Chris

What does the EXPLAIN PLAN show?

Performance issues are always data dependent. IOW, the phrase:
KNOW THY DATA
is one to take to heart.

Come back with the PLAN and we may have some suggestions.
* *Ed
Chris, if the query in the select distinct is the same as the first
there should be no duplicates returned.

What is the full version of Oracle?
Does the explain plan Ed asked for show a hash group by?

If so, depending on your version of Oracle, there is a bug where
queries solved with a hash group by can return wrong results; however,
I would expect the distinct to eliminate duplicates still.

Are you sure you just do not have more rows than you expected?

HTH -- Mark D Powell --


Reply With Quote
  #10  
Old   
Charles Hooper
 
Posts: n/a

Default Re: How to get a distinct count of result set of multople tablejoins? - 02-27-2008 , 10:27 AM



On Feb 26, 11:26*pm, "Chris ( Val )" <chris... (AT) gmail (DOT) com> wrote:
Quote:
Hi everyone,

I haven't played with SQL for a while, but the following SQL returns
~700,000 records ( a lot of duplicates ), and I used a UNION with the
same query to reduce the records down to ~250,000. Is there a more
efficient solution, as the UNION takes some time to run? ~3 mins.

QUERY
* SELECT * FROM schema.tbl_A a, schema.tbl_B b, schema.tbl_C c,
* * * * * * * * * * * * schema.tbl_D d, schema.tbl_E e, schema.tbl_F
f,
* * * * * * * * * * * * schema.tbl_G g, schema.tbl_H h, schema.tbl_I
i,
* * * * * * * * * * * * schema.tbl_J j, schema.tbl_K k, schema.tbl_L
l,
* * * * * * * * * * * * schema.tbl_M m
* * * * * * * WHERE *a.accountid = b.ID(+)
* * * * * * * * AND *a.custom1id = c.ID(+)
* * * * * * * * AND *a.custom2id = d.ID(+)
* * * * * * * * AND *a.custom3id = e.ID
* * * * * * * * AND *a.custom4id = f.ID(+)
* * * * * * * * AND *a.entityid = g.ID(+)
* * * * * * * * AND *a.icpid = h.ID(+)
* * * * * * * * AND *a.parentid = i.ID(+)
* * * * * * * * AND *a.periodid = j.ID(+)
* * * * * * * * AND *a.scenarioid = k.ID(+)
* * * * * * * * AND *a.valueid = l.ID(+)
* * * * * * * * AND *a.viewid = m.ID(+)
* * * * * * * * AND *d.label IN ('XXX', 'XXX')
* * * * * * * * AND *e.label IN ('XXX', 'XXX')
* * * * * * * * AND *f.label = 'X'
* * * * * * * * AND *k.label = 'XXX'
* * * * * * * * AND *l.label = 'XXX'
* * * * * * * * AND *h.label = 'X'
* * * * * * * * AND *m.label IN ('X', 'X');
UNION
* *<rest (the same as above) snipped
*</QUERY

Problem - I am having some trouble getting a DISTINCT COUNT of the
result set from the above. I trieed the
following, but I still get the same (non-distinct) amount returned for
the count.

How do I get a count of the distinct records / result set?
I tried with the UNION as above, as well as the follwoing, but still
get a count which includes the duplicates.

SELECT COUNT(*) FROM
*(
* SELECT DISTINCT* FROM schema.tbl_A a, schema.tbl_B b, schema.tbl_C
c,
* * * * * * * * * * * * schema.tbl_D d, schema.tbl_E e, schema.tbl_F
f,
* * * * * * * * * * * * schema.tbl_G g, schema.tbl_H h, schema.tbl_I
i,
* * * * * * * * * * * * schema.tbl_J j, schema.tbl_K k, schema.tbl_L
l,
* * * * * * * * * * * * schema.tbl_M m
* * * * * * * WHERE *a.accountid = b.ID(+)
* * * * * * * * AND *a.custom1id = c.ID(+)
* * * * * * * * AND *a.custom2id = d.ID(+)
* * * * * * * * AND *a.custom3id = e.ID
* * * * * * * * AND *a.custom4id = f.ID(+)
* * * * * * * * AND *a.entityid = g.ID(+)
* * * * * * * * AND *a.icpid = h.ID(+)
* * * * * * * * AND *a.parentid = i.ID(+)
* * * * * * * * AND *a.periodid = j.ID(+)
* * * * * * * * AND *a.scenarioid = k.ID(+)
* * * * * * * * AND *a.valueid = l.ID(+)
* * * * * * * * AND *a.viewid = m.ID(+)
* * * * * * * * AND *d.label IN ('XXX', 'XXX')
* * * * * * * * AND *e.label IN ('XXX', 'XXX')
* * * * * * * * AND *f.label = 'X'
* * * * * * * * AND *k.label = 'XXX'
* * * * * * * * AND *l.label = 'XXX'
* * * * * * * * AND *h.label = 'X'
* * * * * * * * AND *m.label IN ('X', 'X')
*);

Thanks in advance,
Chris
I suspect that Oracle is taking a long time to remove the duplicate
rows due to a combination of the number of rows, the number of columns
(you are specifying to retrieve all columns from the tables), and the
amount of memory available for sorting unique (or hash unique) the
rows to produce a unique list of rows. Are you able to better define
the columns that are interest, and eliminate any columns that are
common between the various tables. I have found in some cases on
Oracle 10.2.0.2 that listing all columns in a GROUP BY allows Oracle
to retrieve a distinct list of rows more quickly than by using the
DISTINCT syntax:
SELECT
COL1,
COL2,
COL3,
COL4,
COL5,
COL6
FROM
T1
GROUP BY
COL1,
COL2,
COL3,
COL4,
COL5,
COL6;

Will sometimes/often execute faster than this:
SELECT DISTINCT
COL1,
COL2,
COL3,
COL4,
COL5,
COL6
FROM
T1;

In the second of your examples, I do not see where you are using the
UNION syntax - are you trying to do something different in that
example? The DISTINCT in the second example only applies to the one
SELECT in the inline view, which could be a problem if you are using a
UNION ALL with additional SELECTs in the inline view.

Explain plan, or better yet a DBMS Xplan showing the predicted and
actual results would be helpful, as requested by Ed and Mark. If the
explain plan shows nothing useful, turn on a 10046 trace at level 8
and see what is happening behind the scenes.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.