dbTalk Databases Forums  

Why does this query do this?

comp.databases.oracle.misc comp.databases.oracle.misc


Discuss Why does this query do this? in the comp.databases.oracle.misc forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Mtek
 
Posts: n/a

Default Why does this query do this? - 07-11-2008 , 09:51 PM







Hi,

Ok, I need help with this. Here is my data:

TABLE A
---------------
2468
3690
1357

TABLE B
---------------
3690
1234
NULL
1357
NULL
8900

Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? It does not work. I get nothing. If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. Why do I need a NOT NULL on my subquery???

Thanks,

John.

Reply With Quote
  #2  
Old   
Michael D O'Shea
 
Posts: n/a

Default Re: Why does this query do this? - 07-12-2008 , 07:17 AM






On Jul 12, 3:51*am, Mtek <m... (AT) mtekusa (DOT) com> wrote:

Quote:
Hi,

Ok, I need help with this. *Here is my data:

TABLE A
---------------
2468
3690
1357

TABLE B
---------------
3690
1234
NULL
1357
NULL
8900

Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.


Hi John, NULL is unknown, not a value, and treated as such. Thus the
observation in your post.

The following tutorial type article will help you understanding the
difference with examples, http://articles.techrepublic.com.com...1-5319615.html


For your simple query, you might also wish to consider the common set
operations, eg.

SELECT id FROM table_a MINUS SELECT id FROM table_b

hth

Mike


TESSELLA Michael.OShea (AT) tessella (DOT) com
__/__/__/ Tessella Support Services plc
__/__/__/ 3 Vineyard Chambers, ABINGDON, OX14 3PX, England
__/__/__/ Tel: (44)(0)1235-555511 Fax: (44)(0)1235-553301
www.tessella.com Registered in England No. 1466429


Reply With Quote
  #3  
Old   
Michael D O'Shea
 
Posts: n/a

Default Re: Why does this query do this? - 07-12-2008 , 07:17 AM



On Jul 12, 3:51*am, Mtek <m... (AT) mtekusa (DOT) com> wrote:

Quote:
Hi,

Ok, I need help with this. *Here is my data:

TABLE A
---------------
2468
3690
1357

TABLE B
---------------
3690
1234
NULL
1357
NULL
8900

Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.


Hi John, NULL is unknown, not a value, and treated as such. Thus the
observation in your post.

The following tutorial type article will help you understanding the
difference with examples, http://articles.techrepublic.com.com...1-5319615.html


For your simple query, you might also wish to consider the common set
operations, eg.

SELECT id FROM table_a MINUS SELECT id FROM table_b

hth

Mike


TESSELLA Michael.OShea (AT) tessella (DOT) com
__/__/__/ Tessella Support Services plc
__/__/__/ 3 Vineyard Chambers, ABINGDON, OX14 3PX, England
__/__/__/ Tel: (44)(0)1235-555511 Fax: (44)(0)1235-553301
www.tessella.com Registered in England No. 1466429


Reply With Quote
  #4  
Old   
Michael D O'Shea
 
Posts: n/a

Default Re: Why does this query do this? - 07-12-2008 , 07:17 AM



On Jul 12, 3:51*am, Mtek <m... (AT) mtekusa (DOT) com> wrote:

Quote:
Hi,

Ok, I need help with this. *Here is my data:

TABLE A
---------------
2468
3690
1357

TABLE B
---------------
3690
1234
NULL
1357
NULL
8900

Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.


Hi John, NULL is unknown, not a value, and treated as such. Thus the
observation in your post.

The following tutorial type article will help you understanding the
difference with examples, http://articles.techrepublic.com.com...1-5319615.html


For your simple query, you might also wish to consider the common set
operations, eg.

SELECT id FROM table_a MINUS SELECT id FROM table_b

hth

Mike


TESSELLA Michael.OShea (AT) tessella (DOT) com
__/__/__/ Tessella Support Services plc
__/__/__/ 3 Vineyard Chambers, ABINGDON, OX14 3PX, England
__/__/__/ Tel: (44)(0)1235-555511 Fax: (44)(0)1235-553301
www.tessella.com Registered in England No. 1466429


Reply With Quote
  #5  
Old   
Michael D O'Shea
 
Posts: n/a

Default Re: Why does this query do this? - 07-12-2008 , 07:17 AM



On Jul 12, 3:51*am, Mtek <m... (AT) mtekusa (DOT) com> wrote:

Quote:
Hi,

Ok, I need help with this. *Here is my data:

TABLE A
---------------
2468
3690
1357

TABLE B
---------------
3690
1234
NULL
1357
NULL
8900

Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.


Hi John, NULL is unknown, not a value, and treated as such. Thus the
observation in your post.

The following tutorial type article will help you understanding the
difference with examples, http://articles.techrepublic.com.com...1-5319615.html


For your simple query, you might also wish to consider the common set
operations, eg.

SELECT id FROM table_a MINUS SELECT id FROM table_b

hth

Mike


TESSELLA Michael.OShea (AT) tessella (DOT) com
__/__/__/ Tessella Support Services plc
__/__/__/ 3 Vineyard Chambers, ABINGDON, OX14 3PX, England
__/__/__/ Tel: (44)(0)1235-555511 Fax: (44)(0)1235-553301
www.tessella.com Registered in England No. 1466429


Reply With Quote
  #6  
Old   
Charles Hooper
 
Posts: n/a

Default Re: Why does this query do this? - 07-12-2008 , 09:14 AM



On Jul 11, 10:51*pm, Mtek <m... (AT) mtekusa (DOT) com> wrote:
Quote:
Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.
An interesting problem that is possibly answered by looking at the
access predicates in DBMS_Xplan output for the different queries. The
following tests were performed on Oracle 11.1.0.6, which has a null
aware anti-join, so some of the plans may look a little different on
your database version:
First, the set up:
CREATE TABLE T1 (C1 NUMBER(10));
CREATE TABLE T2 (C1 NUMBER(10));

INSERT INTO T1 VALUES (2468);
INSERT INTO T1 VALUES (3690);
INSERT INTO T1 VALUES (1357);

INSERT INTO T2 VALUES (3690);
INSERT INTO T2 VALUES (1234);
INSERT INTO T2 VALUES (NULL);
INSERT INTO T2 VALUES (1357);
INSERT INTO T1 VALUES (NULL);
INSERT INTO T1 VALUES (8900);

COMMIT;

The first SQL statement:
SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2);

no rows selected

The DBMS_Xplan for the above query (right-most columns removed to
prevent line wrapping):
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI NA | | 1 | 1 | 0 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T2 | 1 | 4 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C1"="C1")

An interesting access predicate. I wonder what happens when T1.C1
containing 2468 is compared with T2.C1 containing NULL, would the
comparison be true, false, or unknown? Would the anti of the
comparison be false, true, or known (respectively)?

Trying the same query again with a /*+ NO_QUERY_TRANSFORMATION */ hint
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 0 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 5 | 1 | 5 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NULL)
3 - filter(LNNVL("C1"<>:B1))

Step 3 has another interesting access predicate, but the access
predicate for step 1 is also interesting. What happens when T1.C1 is
compared to see if it is not equal to a NULL in T2.C1?

With the NOT NULL added in the subquery:
SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2
WHERE C1 IS NOT NULL);

C1
----------
2468
8900

---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI SNA| | 1 | 1 | 2 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C1"="C1")
3 - filter("C1" IS NOT NULL)


With a /*+ NO_QUERY_TRANSFORMATION */ hint
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 2 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 5 | 1 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NULL)
3 - filter(("C1" IS NOT NULL AND LNNVL("C1"<>:B1)))


A common variation to avoid the common performance hit of a subquery
in older releases of Oracle (8i and older, or where the
NO_QUERY_TRANSFORMATION hint is used):
SELECT
T1.C1
FROM
T1,
T2
WHERE
T1.C1=T2.C1(+)
AND T2.C1 IS NULL;

C1
----------
(NULL)
2468
8900

The row containing the NULL was completely unexpected.

---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI | | 1 | 1 | 3 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."C1"="T2"."C1")
3 - filter("T2"."C1" IS NOT NULL)


With a /*+ NO_QUERY_TRANSFORMATION */ hint
----------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
----------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 3 |
00:00:00.01 |
* 2 | HASH JOIN OUTER | | 1 | 1 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
4 | TABLE ACCESS FULL| T2 | 1 | 4 | 4 |
00:00:00.01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T2"."C1" IS NULL)
2 - access("T1"."C1"="T2"."C1")


The minus syntax suggested by Michael O'Shea seems to get it right,
and might be one way that I would suggest to attack the problem:
SELECT
C1
FROM
T1
MINUS
SELECT
C1
FROM
T2;

C1
----------
2468
8900

----------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
----------------------------------------------------------------------------
Quote:
1 | MINUS | | 1 | | 2 |
00:00:00.01 |
2 | SORT UNIQUE | | 1 | 5 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
4 | SORT UNIQUE | | 1 | 4 | 4 |
00:00:00.01 |
5 | TABLE ACCESS FULL| T2 | 1 | 4 | 4 |
00:00:00.01 |
----------------------------------------------------------------------------

However, the SORT UNIQUE in the plan makes me wonder what happens if
non-unique values are permitted in T1.C1:
INSERT INTO T1 VALUES (1357);
INSERT INTO T1 VALUES (2468);

SELECT
C1
FROM
T1
MINUS
SELECT
C1
FROM
T2;

C1
----------
2468
8900

SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2
WHERE C1 IS NOT NULL);

C1
----------
2468
2468
8900

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.


Reply With Quote
  #7  
Old   
Charles Hooper
 
Posts: n/a

Default Re: Why does this query do this? - 07-12-2008 , 09:14 AM



On Jul 11, 10:51*pm, Mtek <m... (AT) mtekusa (DOT) com> wrote:
Quote:
Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.
An interesting problem that is possibly answered by looking at the
access predicates in DBMS_Xplan output for the different queries. The
following tests were performed on Oracle 11.1.0.6, which has a null
aware anti-join, so some of the plans may look a little different on
your database version:
First, the set up:
CREATE TABLE T1 (C1 NUMBER(10));
CREATE TABLE T2 (C1 NUMBER(10));

INSERT INTO T1 VALUES (2468);
INSERT INTO T1 VALUES (3690);
INSERT INTO T1 VALUES (1357);

INSERT INTO T2 VALUES (3690);
INSERT INTO T2 VALUES (1234);
INSERT INTO T2 VALUES (NULL);
INSERT INTO T2 VALUES (1357);
INSERT INTO T1 VALUES (NULL);
INSERT INTO T1 VALUES (8900);

COMMIT;

The first SQL statement:
SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2);

no rows selected

The DBMS_Xplan for the above query (right-most columns removed to
prevent line wrapping):
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI NA | | 1 | 1 | 0 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T2 | 1 | 4 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C1"="C1")

An interesting access predicate. I wonder what happens when T1.C1
containing 2468 is compared with T2.C1 containing NULL, would the
comparison be true, false, or unknown? Would the anti of the
comparison be false, true, or known (respectively)?

Trying the same query again with a /*+ NO_QUERY_TRANSFORMATION */ hint
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 0 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 5 | 1 | 5 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NULL)
3 - filter(LNNVL("C1"<>:B1))

Step 3 has another interesting access predicate, but the access
predicate for step 1 is also interesting. What happens when T1.C1 is
compared to see if it is not equal to a NULL in T2.C1?

With the NOT NULL added in the subquery:
SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2
WHERE C1 IS NOT NULL);

C1
----------
2468
8900

---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI SNA| | 1 | 1 | 2 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C1"="C1")
3 - filter("C1" IS NOT NULL)


With a /*+ NO_QUERY_TRANSFORMATION */ hint
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 2 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 5 | 1 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NULL)
3 - filter(("C1" IS NOT NULL AND LNNVL("C1"<>:B1)))


A common variation to avoid the common performance hit of a subquery
in older releases of Oracle (8i and older, or where the
NO_QUERY_TRANSFORMATION hint is used):
SELECT
T1.C1
FROM
T1,
T2
WHERE
T1.C1=T2.C1(+)
AND T2.C1 IS NULL;

C1
----------
(NULL)
2468
8900

The row containing the NULL was completely unexpected.

---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI | | 1 | 1 | 3 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."C1"="T2"."C1")
3 - filter("T2"."C1" IS NOT NULL)


With a /*+ NO_QUERY_TRANSFORMATION */ hint
----------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
----------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 3 |
00:00:00.01 |
* 2 | HASH JOIN OUTER | | 1 | 1 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
4 | TABLE ACCESS FULL| T2 | 1 | 4 | 4 |
00:00:00.01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T2"."C1" IS NULL)
2 - access("T1"."C1"="T2"."C1")


The minus syntax suggested by Michael O'Shea seems to get it right,
and might be one way that I would suggest to attack the problem:
SELECT
C1
FROM
T1
MINUS
SELECT
C1
FROM
T2;

C1
----------
2468
8900

----------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
----------------------------------------------------------------------------
Quote:
1 | MINUS | | 1 | | 2 |
00:00:00.01 |
2 | SORT UNIQUE | | 1 | 5 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
4 | SORT UNIQUE | | 1 | 4 | 4 |
00:00:00.01 |
5 | TABLE ACCESS FULL| T2 | 1 | 4 | 4 |
00:00:00.01 |
----------------------------------------------------------------------------

However, the SORT UNIQUE in the plan makes me wonder what happens if
non-unique values are permitted in T1.C1:
INSERT INTO T1 VALUES (1357);
INSERT INTO T1 VALUES (2468);

SELECT
C1
FROM
T1
MINUS
SELECT
C1
FROM
T2;

C1
----------
2468
8900

SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2
WHERE C1 IS NOT NULL);

C1
----------
2468
2468
8900

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.


Reply With Quote
  #8  
Old   
Charles Hooper
 
Posts: n/a

Default Re: Why does this query do this? - 07-12-2008 , 09:14 AM



On Jul 11, 10:51*pm, Mtek <m... (AT) mtekusa (DOT) com> wrote:
Quote:
Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.
An interesting problem that is possibly answered by looking at the
access predicates in DBMS_Xplan output for the different queries. The
following tests were performed on Oracle 11.1.0.6, which has a null
aware anti-join, so some of the plans may look a little different on
your database version:
First, the set up:
CREATE TABLE T1 (C1 NUMBER(10));
CREATE TABLE T2 (C1 NUMBER(10));

INSERT INTO T1 VALUES (2468);
INSERT INTO T1 VALUES (3690);
INSERT INTO T1 VALUES (1357);

INSERT INTO T2 VALUES (3690);
INSERT INTO T2 VALUES (1234);
INSERT INTO T2 VALUES (NULL);
INSERT INTO T2 VALUES (1357);
INSERT INTO T1 VALUES (NULL);
INSERT INTO T1 VALUES (8900);

COMMIT;

The first SQL statement:
SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2);

no rows selected

The DBMS_Xplan for the above query (right-most columns removed to
prevent line wrapping):
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI NA | | 1 | 1 | 0 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T2 | 1 | 4 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C1"="C1")

An interesting access predicate. I wonder what happens when T1.C1
containing 2468 is compared with T2.C1 containing NULL, would the
comparison be true, false, or unknown? Would the anti of the
comparison be false, true, or known (respectively)?

Trying the same query again with a /*+ NO_QUERY_TRANSFORMATION */ hint
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 0 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 5 | 1 | 5 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NULL)
3 - filter(LNNVL("C1"<>:B1))

Step 3 has another interesting access predicate, but the access
predicate for step 1 is also interesting. What happens when T1.C1 is
compared to see if it is not equal to a NULL in T2.C1?

With the NOT NULL added in the subquery:
SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2
WHERE C1 IS NOT NULL);

C1
----------
2468
8900

---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI SNA| | 1 | 1 | 2 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C1"="C1")
3 - filter("C1" IS NOT NULL)


With a /*+ NO_QUERY_TRANSFORMATION */ hint
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 2 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 5 | 1 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NULL)
3 - filter(("C1" IS NOT NULL AND LNNVL("C1"<>:B1)))


A common variation to avoid the common performance hit of a subquery
in older releases of Oracle (8i and older, or where the
NO_QUERY_TRANSFORMATION hint is used):
SELECT
T1.C1
FROM
T1,
T2
WHERE
T1.C1=T2.C1(+)
AND T2.C1 IS NULL;

C1
----------
(NULL)
2468
8900

The row containing the NULL was completely unexpected.

---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI | | 1 | 1 | 3 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."C1"="T2"."C1")
3 - filter("T2"."C1" IS NOT NULL)


With a /*+ NO_QUERY_TRANSFORMATION */ hint
----------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
----------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 3 |
00:00:00.01 |
* 2 | HASH JOIN OUTER | | 1 | 1 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
4 | TABLE ACCESS FULL| T2 | 1 | 4 | 4 |
00:00:00.01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T2"."C1" IS NULL)
2 - access("T1"."C1"="T2"."C1")


The minus syntax suggested by Michael O'Shea seems to get it right,
and might be one way that I would suggest to attack the problem:
SELECT
C1
FROM
T1
MINUS
SELECT
C1
FROM
T2;

C1
----------
2468
8900

----------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
----------------------------------------------------------------------------
Quote:
1 | MINUS | | 1 | | 2 |
00:00:00.01 |
2 | SORT UNIQUE | | 1 | 5 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
4 | SORT UNIQUE | | 1 | 4 | 4 |
00:00:00.01 |
5 | TABLE ACCESS FULL| T2 | 1 | 4 | 4 |
00:00:00.01 |
----------------------------------------------------------------------------

However, the SORT UNIQUE in the plan makes me wonder what happens if
non-unique values are permitted in T1.C1:
INSERT INTO T1 VALUES (1357);
INSERT INTO T1 VALUES (2468);

SELECT
C1
FROM
T1
MINUS
SELECT
C1
FROM
T2;

C1
----------
2468
8900

SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2
WHERE C1 IS NOT NULL);

C1
----------
2468
2468
8900

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.


Reply With Quote
  #9  
Old   
Charles Hooper
 
Posts: n/a

Default Re: Why does this query do this? - 07-12-2008 , 09:14 AM



On Jul 11, 10:51*pm, Mtek <m... (AT) mtekusa (DOT) com> wrote:
Quote:
Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.
An interesting problem that is possibly answered by looking at the
access predicates in DBMS_Xplan output for the different queries. The
following tests were performed on Oracle 11.1.0.6, which has a null
aware anti-join, so some of the plans may look a little different on
your database version:
First, the set up:
CREATE TABLE T1 (C1 NUMBER(10));
CREATE TABLE T2 (C1 NUMBER(10));

INSERT INTO T1 VALUES (2468);
INSERT INTO T1 VALUES (3690);
INSERT INTO T1 VALUES (1357);

INSERT INTO T2 VALUES (3690);
INSERT INTO T2 VALUES (1234);
INSERT INTO T2 VALUES (NULL);
INSERT INTO T2 VALUES (1357);
INSERT INTO T1 VALUES (NULL);
INSERT INTO T1 VALUES (8900);

COMMIT;

The first SQL statement:
SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2);

no rows selected

The DBMS_Xplan for the above query (right-most columns removed to
prevent line wrapping):
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI NA | | 1 | 1 | 0 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T2 | 1 | 4 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C1"="C1")

An interesting access predicate. I wonder what happens when T1.C1
containing 2468 is compared with T2.C1 containing NULL, would the
comparison be true, false, or unknown? Would the anti of the
comparison be false, true, or known (respectively)?

Trying the same query again with a /*+ NO_QUERY_TRANSFORMATION */ hint
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 0 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 5 | 1 | 5 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NULL)
3 - filter(LNNVL("C1"<>:B1))

Step 3 has another interesting access predicate, but the access
predicate for step 1 is also interesting. What happens when T1.C1 is
compared to see if it is not equal to a NULL in T2.C1?

With the NOT NULL added in the subquery:
SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2
WHERE C1 IS NOT NULL);

C1
----------
2468
8900

---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI SNA| | 1 | 1 | 2 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C1"="C1")
3 - filter("C1" IS NOT NULL)


With a /*+ NO_QUERY_TRANSFORMATION */ hint
---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 2 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 5 | 1 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NULL)
3 - filter(("C1" IS NOT NULL AND LNNVL("C1"<>:B1)))


A common variation to avoid the common performance hit of a subquery
in older releases of Oracle (8i and older, or where the
NO_QUERY_TRANSFORMATION hint is used):
SELECT
T1.C1
FROM
T1,
T2
WHERE
T1.C1=T2.C1(+)
AND T2.C1 IS NULL;

C1
----------
(NULL)
2468
8900

The row containing the NULL was completely unexpected.

---------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
---------------------------------------------------------------------------
Quote:
* 1 | HASH JOIN ANTI | | 1 | 1 | 3 |
00:00:00.01 |
2 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
* 3 | TABLE ACCESS FULL| T2 | 1 | 3 | 3 |
00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."C1"="T2"."C1")
3 - filter("T2"."C1" IS NOT NULL)


With a /*+ NO_QUERY_TRANSFORMATION */ hint
----------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
----------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 3 |
00:00:00.01 |
* 2 | HASH JOIN OUTER | | 1 | 1 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
4 | TABLE ACCESS FULL| T2 | 1 | 4 | 4 |
00:00:00.01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T2"."C1" IS NULL)
2 - access("T1"."C1"="T2"."C1")


The minus syntax suggested by Michael O'Shea seems to get it right,
and might be one way that I would suggest to attack the problem:
SELECT
C1
FROM
T1
MINUS
SELECT
C1
FROM
T2;

C1
----------
2468
8900

----------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time |
----------------------------------------------------------------------------
Quote:
1 | MINUS | | 1 | | 2 |
00:00:00.01 |
2 | SORT UNIQUE | | 1 | 5 | 5 |
00:00:00.01 |
3 | TABLE ACCESS FULL| T1 | 1 | 5 | 5 |
00:00:00.01 |
4 | SORT UNIQUE | | 1 | 4 | 4 |
00:00:00.01 |
5 | TABLE ACCESS FULL| T2 | 1 | 4 | 4 |
00:00:00.01 |
----------------------------------------------------------------------------

However, the SORT UNIQUE in the plan makes me wonder what happens if
non-unique values are permitted in T1.C1:
INSERT INTO T1 VALUES (1357);
INSERT INTO T1 VALUES (2468);

SELECT
C1
FROM
T1
MINUS
SELECT
C1
FROM
T2;

C1
----------
2468
8900

SELECT
C1
FROM
T1
WHERE
C1 NOT IN (
SELECT
C1
FROM
T2
WHERE C1 IS NOT NULL);

C1
----------
2468
2468
8900

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.


Reply With Quote
  #10  
Old   
Ken Denny
 
Posts: n/a

Default Re: Why does this query do this? - 07-16-2008 , 07:48 AM



On Jul 11, 10:51*pm, Mtek <m... (AT) mtekusa (DOT) com> wrote:
Quote:
Hi,

Ok, I need help with this. *Here is my data:

TABLE A
---------------
2468
3690
1357

TABLE B
---------------
3690
1234
NULL
1357
NULL
8900

Ok, I want to get a list of items in TABLE A which do not exist in
TABLE B. *Simple right?

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b);

Guess what? *It does not work. *I get nothing. *If I use IN, it
returns things, but if I use NOT IN I get nothing However, if I do
this:

SELECT id FROM table_a WHERE id NOT IN (SELECT id FROM table_b WHERE
id IS NOT NULL);

Well, it works fine. *Why do I need a NOT NULL on my subquery???

Thanks,

John.
Another thing I have found is that "NOT IN" performs like a pig and I
have yet to see a "NOT IN" that couldn't be expressed as a "MINUS" and
get much better performance as a result. So "WHERE id NOT IN (SELECT
id FROM table_b)" can be expressed as
"WHERE id IN (SELECT id FROM table_a MINUS SELECT id FROM table_b)"
which will perform vastly better than using "NOT IN".


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.