On Jan 13, 3:34*pm, nickli2... (AT) gmail (DOT) com wrote:
Quote:
Hi,
* I have a question on how composite index should be used. As an
example, my queries often select two leading columns, column_1 and
column_2 of a table. Should I create a composite index using column_1
and column_2, *or should I create two separated indexes for column_1
and column_2? What are the differences between these two approaches
and performance implications?
* Thanks in advance.
* Nick |
I don't think that you have provided enough information for a
reasonably accurate answer to be provided. There are too many
possible exceptions to any type of broad indexing rule for this
situation.
Assume the following table exists:
CREATE TABLE T1(
COLUMN_1 VARCHAR2(10) NOT NULL,
COLUMN_2 VARCHAR2(10) NOT NULL,
COLUMN_3 VARCHAR2(20),
COLUMN_4 VARCHAR2(20),
COLUMN_5 VARCHAR2(20),
COLUMN_6 VARCHAR2(20),
COLUMN_7 VARCHAR2(20),
COLUMN_8 VARCHAR2(20),
COLUMN_9 VARCHAR2(20),
COLUMN_10 VARCHAR2(20));
If your queries typically have a WHERE clause like this:
WHERE
COLUMN_1='my_val'
AND COLUMN_2='my_val2'
It probably makes sense to create an index on (COLUMN_1,COLUMN_2).
The same index could potentially satisfy a WHERE clause that specifies
only one of the two columns by using an index skip scan.
---
If your queries typically have a WHERE clause like this:
WHERE
COLUMN_1='my_val'
It probably does not make sense to include COLUMN_2 in the index, as
doing so will likely increase the clustering factor of the index,
making it appear in cost determination to be a more expensive data
access method. Of course, if you are only selecting COLUMN_1 and
COLUMN_2, Oracle may only need to visit the index to retrieve the
rows, and not visit the table.
The best answer is to test. One such test on Oracle 10.2.0.3:
INSERT INTO
T1
SELECT
DBMS_RANDOM.STRING('A',10),
DBMS_RANDOM.STRING('A',5),
DBMS_RANDOM.STRING('A',15),
DBMS_RANDOM.STRING('A',15),
DBMS_RANDOM.STRING('A',15),
DBMS_RANDOM.STRING('A',15),
DBMS_RANDOM.STRING('A',15),
DBMS_RANDOM.STRING('A',15),
DBMS_RANDOM.STRING('A',15),
DBMS_RANDOM.STRING('A',15)
FROM
{a table with at least 100,000 rows}
WHERE
ROWNUM<=100000;
COMMIT;
Now, let's create three indexes, the first of which will be a unique
index on the two columns:
CREATE UNIQUE INDEX T1_IND1 ON T1(COLUMN_1,COLUMN_2);
CREATE INDEX T1_IND2 ON T1(COLUMN_1);
CREATE INDEX T1_IND3 ON T1(COLUMN_2);
Let's make certain that the table and index stats are current:
EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAM E=>'T1',CASCADE=>TRUE);
A simple select on only the two columns:
SELECT
COLUMN_1,
COLUMN_2
FROM
T1;
The DBMS_XPLAN:
------------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows |
A-Time | Buffers |
|
------------------------------------------------------------------------------------------
Quote:
1 | INDEX FAST FULL SCAN| T1_IND1 | 1 | 100K| 100K|
00:00:00.30 | 1375 |
|
------------------------------------------------------------------------------------------
In the above, Oracle only needed to visit the index, and avoided the
visit to the table.
--
SELECT
COLUMN_1,
COLUMN_2,
COLUMN_3
FROM
T1
WHERE
COLUMN_1<'BBB';
------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time | Buffers |
|
------------------------------------------------------------------------------------
Quote:
* 1 | TABLE ACCESS FULL| T1 | 1 | 1747 | 1912 |
00:00:00.02 | 2149 |
|
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COLUMN_1"<'BBB')
In the above, Oracle determined that the best access method is a full
table scan, even though there were two indexes on COLUMN_1 that could
have been used.
--
SELECT
COLUMN_1,
COLUMN_2,
COLUMN_3
FROM
T1
WHERE
COLUMN_1<'BBB'
AND COLUMN_2<'BBB';
-------------------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-
Rows | A-Time | Buffers |
|
-------------------------------------------------------------------------------------------------
Quote:
1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 31 |
34 |00:00:00.01 | 43 |
* 2 | INDEX RANGE SCAN | T1_IND1 | 1 | 31 |
34 |00:00:00.01 | 9 |
|
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COLUMN_1"<'BBB' AND "COLUMN_2"<'BBB')
filter("COLUMN_2"<'BBB')
In the above, Oracle determined that the number of rows that would be
returned would be very small (0.031% of the rows), so the index based
access appeared to be the best option.
--
SELECT
COLUMN_1,
COLUMN_2,
COLUMN_3
FROM
T1
WHERE
COLUMN_2<'BBB';
------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time | Buffers |
|
------------------------------------------------------------------------------------
Quote:
* 1 | TABLE ACCESS FULL| T1 | 1 | 1747 | 2080 |
00:00:00.02 | 2148 |
|
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COLUMN_2"<'BBB')
In the above, Oracle predicted that 1747 rows (1.75%) of the rows in
the table would be returned, and still performed a full table scan,
rather than using one of the two available indexes.
--
SELECT
COLUMN_2,
COUNT(*)
FROM
T1
GROUP BY
COLUMN_2
HAVING
COUNT(*)>1;
--------------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows
A-Time | Buffers |
--------------------------------------------------------------------------------------------
* 1 | FILTER | | 1 | | 10 |
00:00:00.89 | 244 |
2 | HASH GROUP BY | | 1 | 5000 | 99990 |
00:00:00.66 | 244 |
3 | INDEX FAST FULL SCAN| T1_IND3 | 1 | 100K| 100K|
00:00:00.20 | 244 |
|
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(COUNT(*)>1)
In the above, Oracle was able to return the results by only visiting
the index on COLUMN_2.
--
Let's repeat the above, this time forcing Oracle to use the composite
index:
SELECT /*+ INDEX(T1,T1_IND1) */
COLUMN_2,
COUNT(*)
FROM
T1
GROUP BY
COLUMN_2
HAVING
COUNT(*)>1;
---------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time | Buffers |
|
---------------------------------------------------------------------------------------
Quote:
* 1 | FILTER | | 1 | | 10 |
00:00:00.90 | 377 |
2 | HASH GROUP BY | | 1 | 5000 | 99990 |
00:00:00.66 | 377 |
3 | INDEX FULL SCAN| T1_IND1 | 1 | 100K| 100K|
00:00:00.20 | 377 |
|
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(COUNT(*)>1)
In this case, Oracle required a little more time to return the result,
and was still able to satisfy the query using only the composite index
on COLUMN_1,COLUMN_2.
--
Let's try another query that cannot be satisfied using just an index,
as it includes COLUMN_3 (this query specifies the 10 matching COLUMN_2
values returned by the previous query):
SELECT
COLUMN_1,
COLUMN_2,
COLUMN_3
FROM
T1
WHERE
COLUMN_2 IN (
'WJTai',
'lADDY',
'XyjEy',
'YbVSm',
'oGbDL',
'vAVOV',
'FBuTH',
'Omkfw',
'mGjHX',
'xywzx');
--------------------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-
Rows | A-Time | Buffers |
|
--------------------------------------------------------------------------------------------------
Quote:
1 | INLIST ITERATOR | | 1 | |
20 |00:00:00.01 | 40 |
2 | TABLE ACCESS BY INDEX ROWID| T1 | 10 | 10 |
20 |00:00:00.01 | 40 |
* 3 | INDEX RANGE SCAN | T1_IND3 | 10 | 10 |
20 |00:00:00.01 | 20 |
|
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access(("COLUMN_2"='FBuTH' OR "COLUMN_2"='Omkfw' OR
"COLUMN_2"='WJTai' OR
"COLUMN_2"='XyjEy' OR "COLUMN_2"='YbVSm' OR
"COLUMN_2"='lADDY' OR "COLUMN_2"='mGjHX' OR
"COLUMN_2"='oGbDL' OR "COLUMN_2"='vAVOV' OR
"COLUMN_2"='xywzx'))
Oracle quickly returned the 20 matching rows by probing the T1_IND3
index 10 times.
--
Let's try the same using the composite index:
SELECT /*+ INDEX(T1,T1_IND1) */
COLUMN_1,
COLUMN_2,
COLUMN_3
FROM
T1
WHERE
COLUMN_2 IN (
'WJTai',
'lADDY',
'XyjEy',
'YbVSm',
'oGbDL',
'vAVOV',
'FBuTH',
'Omkfw',
'mGjHX',
'xywzx');
-------------------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-
Rows | A-Time | Buffers |
|
-------------------------------------------------------------------------------------------------
Quote:
1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 10 |
20 |00:00:00.03 | 397 |
* 2 | INDEX FULL SCAN | T1_IND1 | 1 | 10 |
20 |00:00:00.03 | 377 |
|
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("COLUMN_2"='FBuTH' OR "COLUMN_2"='Omkfw' OR
"COLUMN_2"='WJTai' OR
"COLUMN_2"='XyjEy' OR "COLUMN_2"='YbVSm' OR
"COLUMN_2"='lADDY' OR "COLUMN_2"='mGjHX' OR
"COLUMN_2"='oGbDL' OR "COLUMN_2"='vAVOV' OR
"COLUMN_2"='xywzx'))
The above required three times as long to execute, and the index
lookup changed from an index range scan to an index full scan.
--
Let's try an inline view to restrict the query results to 100 rows:
SELECT
T1.COLUMN_1,
T1.COLUMN_2,
T1.COLUMN_3,
T1.COLUMN_4
FROM
(SELECT
COLUMN_1,
COLUMN_2,
COLUMN_3
FROM
T1
WHERE
ROWNUM<=100) V_T1,
T1
WHERE
V_T1.COLUMN_1=T1.COLUMN_1
AND V_T1.COLUMN_2=T1.COLUMN_2;
--------------------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-
Rows | A-Time | Buffers |
|
--------------------------------------------------------------------------------------------------
Quote:
1 | NESTED LOOPS | | 1 | 100 |
100 |00:00:00.01 | 209 |
2 | VIEW | | 1 | 100 |
100 |00:00:00.01 | 7 |
* 3 | COUNT STOPKEY | | 1 | |
100 |00:00:00.01 | 7 |
4 | INDEX FAST FULL SCAN | T1_IND1 | 1 | 100K|
100 |00:00:00.01 | 7 |
5 | TABLE ACCESS BY INDEX ROWID| T1 | 100 | 1 |
100 |00:00:00.01 | 202 |
* 6 | INDEX UNIQUE SCAN | T1_IND1 | 100 | 1 |
100 |00:00:00.01 | 102 |
|
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(ROWNUM<=100)
6 - access("V_T1"."COLUMN_1"="T1"."COLUMN_1" AND
"V_T1"."COLUMN_2"="T1"."COLUMN_2")
In the above, the inline view was satisfied using an INDEX FAST FULL
SCAN on the T1_IND1 index, without visiting the table, even though
COLUMN_3 is included in the inline view - a smart optimization by
Oracle.
--
Let's try again, this time also matching COLUMN_3 from the inline
view:
SELECT
T1.COLUMN_1,
T1.COLUMN_2,
T1.COLUMN_3,
T1.COLUMN_4
FROM
(SELECT
COLUMN_1,
COLUMN_2,
COLUMN_3
FROM
T1
WHERE
ROWNUM<=100) V_T1,
T1
WHERE
V_T1.COLUMN_1=T1.COLUMN_1
AND V_T1.COLUMN_2=T1.COLUMN_2
AND V_T1.COLUMN_3=T1.COLUMN_3;
--------------------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-
Rows | A-Time | Buffers |
|
--------------------------------------------------------------------------------------------------
Quote:
1 | NESTED LOOPS | | 1 | 1 |
100 |00:00:00.01 | 208 |
2 | VIEW | | 1 | 100 |
100 |00:00:00.01 | 6 |
* 3 | COUNT STOPKEY | | 1 | |
100 |00:00:00.01 | 6 |
4 | TABLE ACCESS FULL | T1 | 1 | 100K|
100 |00:00:00.01 | 6 |
* 5 | TABLE ACCESS BY INDEX ROWID| T1 | 100 | 1 |
100 |00:00:00.01 | 202 |
* 6 | INDEX UNIQUE SCAN | T1_IND1 | 100 | 1 |
100 |00:00:00.01 | 102 |
|
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(ROWNUM<=100)
5 - filter("V_T1"."COLUMN_3"="T1"."COLUMN_3")
6 - access("V_T1"."COLUMN_1"="T1"."COLUMN_1" AND
"V_T1"."COLUMN_2"="T1"."COLUMN_2")
In the above, Oracle determined that the T1_IND1 index alone could not
satisfy the inline view, as the value of COLUMN_3 is needed in the
WHERE clause - Oracle performed a full table scan (stopping after
reading 100 rows) in the inline view, rather than use an index. This
query will likely return different rows than the previous query.
--
Let's remove the COLUMN_1 line from the WHERE clause:
SELECT
T1.COLUMN_1,
T1.COLUMN_2,
T1.COLUMN_3,
T1.COLUMN_4
FROM
(SELECT
COLUMN_1,
COLUMN_2,
COLUMN_3
FROM
T1
WHERE
ROWNUM<=100) V_T1,
T1
WHERE
V_T1.COLUMN_2=T1.COLUMN_2
AND V_T1.COLUMN_3=T1.COLUMN_3;
-------------------------------------------------------------------------------------------------
Quote:
Id | Operation | Name | Starts | E-Rows | A-
Rows | A-Time | Buffers |
|
-------------------------------------------------------------------------------------------------
Quote:
* 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 |
100 |00:00:00.01 | 113 |
2 | NESTED LOOPS | | 1 | 100 |
201 |00:00:00.01 | 110 |
3 | VIEW | | 1 | 100 |
100 |00:00:00.01 | 6 |
* 4 | COUNT STOPKEY | | 1 | |
100 |00:00:00.01 | 6 |
5 | TABLE ACCESS FULL | T1 | 1 | 100K|
100 |00:00:00.01 | 6 |
* 6 | INDEX RANGE SCAN | T1_IND3 | 100 | 1 |
100 |00:00:00.01 | 104 |
|
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("V_T1"."COLUMN_3"="T1"."COLUMN_3")
4 - filter(ROWNUM<=100)
6 - access("V_T1"."COLUMN_2"="T1"."COLUMN_2")
In the above, Oracle performed a full table scan (stopping after
reading 100 rows) in the inline view, and this time used the index on
COLUMN_2 to find the matching rows in T1.
A look at a 10053 trace file for your queries, as well as
experimentation with DBMS_XPLAN might help you make the right
decision, rather than relying on broad rules of thumb.
Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.