dbTalk Databases Forums  

Ora 9.2 SQL: What's the fastest way to do this set operation...

comp.databases.oracle.misc comp.databases.oracle.misc


Discuss Ora 9.2 SQL: What's the fastest way to do this set operation... in the comp.databases.oracle.misc forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
dean
 
Posts: n/a

Default Ora 9.2 SQL: What's the fastest way to do this set operation... - 04-30-2006 , 12:04 AM






Imagine I have a database of 10,000 trains and their ordered list of
(around 2 million) scheduled locations, keyed on those trains and an
order number. I want to go from A to B and there are no direct trains.
I need to find all the pairs of trains, one going through A and the
other through B, that share AT LEAST ONE OTHER location in common, call
it C.

The only way I can think of doing it is to find the set of trains T_A
that go through A, and the set of trains T_B that go through B, and see
if there is any intersection in their route locations.

As far as I can tell, doing the INTERSECT means actually calculating
the entire intersection, even though I only need to know that at least
one location is common to both trains.

Is there a fast function in oracle (9.2) that will do something like
this - something along the lines of OVERLAPS.

If I do the join on LOCATION, or a count of locations in T_A that are
in T_B, I get the correct answer but it is taking 10 seconds+, and I'd
like to reduce that.

Thanks for any help.

Dean


Reply With Quote
  #2  
Old   
Brian Peasland
 
Posts: n/a

Default Re: Ora 9.2 SQL: What's the fastest way to do this set operation... - 04-30-2006 , 11:00 AM






You'll need to know that train T_A goes from point A to point C. So your
table will contain TRAIN_NAME (T_A), SOURCE point (A) and DEST point
(C). Similarly, you'll need to know that train T_B goes from point C to
point B. So your table will contain SOURCE point (C) and DEST point (B).
Now a simple SQL statment will find the two trains:

SELECT L1.TRAIN_NAME AS LEG1_TRAIN, L2.TRAIN_NAME AS LEG2_TRAIN
FROM TRAIN_SCHEDULES L1, TRAIN_SCHEDULES L2
WHERE L1.DEST = L2.SOURCE
AND L1.SOURCE = 'A' AND L2.DEST = 'B';

The above will give you any trains that start at point A for leg 1,
arrive at point B for leg 2, and share a common midpoint for the overall
leg.

The query above does not take into account that the LEG1_TRAIN may
arrive after the departure of the LEG2_TRAIN. So you might have to
modify the query to take that into account....

Cheers,
Brian


--
================================================== =================

Brian Peasland
oracle_dba (AT) nospam (DOT) peasland.net
http://www.peasland.net

Remove the "nospam." from the email address to email me.


"I can give it to you cheap, quick, and good.
Now pick two out of the three" - Unknown

Reply With Quote
  #3  
Old   
hansF
 
Posts: n/a

Default Re: Ora 9.2 SQL: What's the fastest way to do this set operation... - 04-30-2006 , 11:12 PM



On Sat, 29 Apr 2006 22:04:57 -0700, dean wrote:

Quote:
Imagine I have a database of 10,000 trains and their ordered list of
(around 2 million) scheduled locations, keyed on those trains and an
order number. I want to go from A to B and there are no direct trains.
I need to find all the pairs of trains, one going through A and the
other through B, that share AT LEAST ONE OTHER location in common, call
it C.

I'd probably imagine upgrading to Oracle10g and using it's Spatial
directed network capability and the routing engine.


Reply With Quote
  #4  
Old   
dean
 
Posts: n/a

Default Re: Ora 9.2 SQL: What's the fastest way to do this set operation... - 04-30-2006 , 11:15 PM



Hi Brian, thanks for the code. The problem is that this is a set theory
problem, and I am not a mathematician. Point C is not known, and there
may be many points that (what's known as a block swap) can occur in
both trains (in fact this is quite likely if they share part of a
corridor). In addition, the swap does not have to be at the first
train's destination, or the origin of the second train.

The problem is how to stop oracle determining the full intersection on
locations, and to stop when it finds _any_ shared occurence of a
location.

This is my statement just for reference. I was hoping the distinct
keyword would provide enough of a hint to oracle that it doesn't have
to find all the shared points C.

var LOCATION1 varchar2(30);
execute :LOCATION1 := '000006';
var LOCATION2 varchar2(30);
execute :LOCATION2 := 'ANB865';

select * from
(
with TER as (select * from train_expanded_route_008),
TRAINS_THRU_1 as (select distinct TRAIN_I, P_VERSION_I from TER where
location_i = :LOCATION1),
TRAINS_THRU_2 as (select distinct TRAIN_I, P_VERSION_I from TER where
location_i = :LOCATION2),
FULL_TR_THRU_1 as (select TRAIN_I, P_VERSION_I, ROUTE_ORDER_I,
LOCATION_I from TER where (TRAIN_I, P_VERSION_I) in (select * from
TRAINS_THRU_1)),
FULL_TR_THRU_2 as (select TRAIN_I, P_VERSION_I, ROUTE_ORDER_I,
LOCATION_I from TER where (TRAIN_I, P_VERSION_I) in (select * from
TRAINS_THRU_2))
select distinct
FULL_TR_THRU_1.TRAIN_I T1,
FULL_TR_THRU_1.P_VERSION_I P1,
FULL_TR_THRU_2.TRAIN_I T2,
FULL_TR_THRU_2.P_VERSION_I P2
from FULL_TR_THRU_1, FULL_TR_THRU_2
where FULL_TR_THRU_1.LOCATION_I = FULL_TR_THRU_2.LOCATION_I
and rownum < 20
);


Reply With Quote
  #5  
Old   
dean
 
Posts: n/a

Default Re: Ora 9.2 SQL: What's the fastest way to do this set operation... - 04-30-2006 , 11:20 PM



Here's the results for rownum < 20: Note it takes 20 secs.


T1 P1 T2 P2
---------- ---------- ---------- ----------
1661 1 49 1
1667 1 49 1
1727 1 49 1
1737 1 49 1
1759 1 49 1
1760 1 49 1
1762 1 49 1
1998 1 49 1
4228 1 49 1
4234 1 49 1
4236 1 49 1
5696 1 49 1
5792 1 49 1
5814 1 49 1
5837 1 49 1

15 rows selected.

Elapsed: 00:00:20.31


Reply With Quote
  #6  
Old   
Brian Peasland
 
Posts: n/a

Default Re: Ora 9.2 SQL: What's the fastest way to do this set operation... - 05-01-2006 , 11:05 AM



Quote:
The problem is that this is a set theory
problem, and I am not a mathematician.
Actually, this is not really a set theory problem. This is a network
flow problem. And yes...I not only play a mathematician on TV...I am one
in real life as well.

Quote:
Point C is not known,
My solution does not require point C to be known. My SQL statement will
determine the Point C's that let you get from point A to point B
(through C).

Quote:
and there
may be many points that (what's known as a block swap) can occur in
both trains (in fact this is quite likely if they share part of a
corridor). In addition, the swap does not have to be at the first
train's destination, or the origin of the second train.
My solution only gave an answer in going from point A to B provided only
one middle point was needed, but there may be many other midpoints in
the path. In the study of network flows, this is called a "search".
There are two different types of searching...depth first and breadth
first. These searches typically use recursive techniques to solve, but
are not limited to those.

While you may have a search involved, you may also want to perform some
"shortest path" analysis as well. Why send a train through Seattle when
the train needs to go from Los Angeles to Miami? Surely a shorter path
from LA to Miami exists without having to go all the way up to Seattle.
One of the most popular shortest path approaches in use is Dijkstra's
Algorithm. Note, the shortest path is typically (but no limited to)
either distance or time. The following shows a good description of
Dijkstra's Algorithm:

http://en.wikipedia.org/wiki/Dijkstra's_algorithm

They give pseudo-code on how to implement Dijkstra's Algorithm. From the
pseudo-code you can see that a PL/SQL implementation is best for your
type of problem. So you can code this in a Stored Procedure.

There are other shortest path algorithm's and some variations on
Dijkstra's Algorithm as well. A Google search can bring up tons of
information!


HTH,
Brian


--
================================================== =================

Brian Peasland
oracle_dba (AT) nospam (DOT) peasland.net
http://www.peasland.net

Remove the "nospam." from the email address to email me.


"I can give it to you cheap, quick, and good.
Now pick two out of the three" - Unknown


Reply With Quote
  #7  
Old   
AT
 
Posts: n/a

Default Re: Ora 9.2 SQL: What's the fastest way to do this set operation... - 05-01-2006 , 04:44 PM



While I think Brian is correct (as usual), and I'm too lazy to think
about this in detail (as usual), I think you might benefit from
examining explain plans and checking out
http://asktom.oracle.com/pls/ask/f?p...62364503028143
.. I suspect the distincts are forcing a lot of unnecessary sorts and
opine that playing around with rownum positioning controlling the
various selects, watching the effect on plans, might allow the
intersect pruning you desire. If you can get there, that might be
better than PL. Also, I'm unsure if rownum can short-circuit the sort
in having/group by, but if so that might be a way to get "at least
one."

jg
--
@home.com is bogus.
http://www.spectrum.ieee.org/mar06/3069/1


Reply With Quote
  #8  
Old   
dean
 
Posts: n/a

Default Re: Ora 9.2 SQL: What's the fastest way to do this set operation... - 05-03-2006 , 09:00 AM



I usually try with and without the distinct keywork in cases such as
this, and choose the fastest one.

So basically, there is no hack that can determine if two sets overlap,
without actually determining the intersect, in SQL. That's fine if that
is the answer.

Thanks all,

Dean


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.