Re: Performance question: Union vs OR -
08-30-2009
, 05:51 AM
erantz_dba_db2, That is not an easy question to answer as performance can involve other things such as Number of Rows, Indexes, and when Runstats was run. To complicate this, DB2 can rewrite your query before it goes through the optimizer.
However, in general, the OR will be better than the Union. Keeping this simple with one table,
{code}SELECT col-list
FROM table1
WHERE col1 = x
OR col2 = Y{code}
will evaluate the rows in the table once, checking both columns.
While this UNION query,
{code}SELECT col-list
FROM table1
WHERE col1 = x
UNION ALL
SELECT col-list
FROM table1
WHERE col2 = y{code}
will read the rows in the table twice (once for each select query). In addition, if you used a UNION instead of a UNION ALL, a Sort will be needed to remove any potential duplicate rows.
However, I have seen that if you query has this format:
{code}SELECT col-list
FROM table1
WHERE col3 = Z
AND ( col1 = x
OR col2 = Y
){code}
Any Index on COL1 and/or COL2 are NOT used. If this happens and it causes a table space scan on a large table, it can be more efficient to eliminate the AND ( X OR Y) by using a UNION ALL if Indexes are available.
{code}SELECT col-list
FROM table1
WHERE col3 = Z
AND col2 = x
UNION ALL
SELECT col-list
FROM table1
WHERE col3 = Z
AND col1 = y{code}
The best thing to do is make sure Runstats are up to date and run the various queries through Explain to see how DB2 handles each variation and run the queries on a representative volume of data.
PS For more complex multi-table joins and/or Subqueries, there are too many factors involved to give good answer for what works best. In those cases, be flexible and consider other ways of writing the query (Common Table Expression vs. Nested Table, Join vs. Subquery, etc.) and use Explain. |