dbTalk Databases Forums  

Compare 2 tables

comp.databases.oracle.misc comp.databases.oracle.misc


Discuss Compare 2 tables in the comp.databases.oracle.misc forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
coral.ressources@gmail.com
 
Posts: n/a

Default Compare 2 tables - 03-08-2009 , 11:16 AM






I have 2 tables: Input and Output.
Input table stores records in this manner:
recno, prod_first, prod_span, prod_last
1, 1, 1000, 1000
2, 1001, 700, 1700
3, 1701, 100, 1800
....and so on
prod_first is the id of the product,
prod_span is the number of products stored by recno,
prod_last is the last product stored by recno

Output tables stores records related to outgoings:
recno, prod_first, prod_span, prod_last
1, 500, 25, 525
2, 300, 100, 400
3, 1243, 100, 1343
That mean productss no from 500 to 525, 300 to 400 and 1243 to 1343
were delivered, while the rest from input table are still in stock.
Question: how to find what is [still] in input table and not [yet] in
output table?
prod_first to prod_last are the items to be found in both tables.
I tought like this:
1. make 2 temp tables: input_temp, output_temp where i would insert
all records [one-by-one] since in original tables i have only start &
end.
2. make a comparison between the 2 temp tables (using SQL aggregation)
I am concerned that step 1 would hit the db performance as i have for
input table aprox 100 000 [and growing!] records to insert.


Reply With Quote
  #2  
Old   
Shakespeare
 
Posts: n/a

Default Re: Compare 2 tables - 03-09-2009 , 05:00 AM






coral.ressources (AT) gmail (DOT) com schreef:
Quote:
I have 2 tables: Input and Output.
Input table stores records in this manner:
recno, prod_first, prod_span, prod_last
1, 1, 1000, 1000
2, 1001, 700, 1700
3, 1701, 100, 1800
...and so on
prod_first is the id of the product,
prod_span is the number of products stored by recno,
prod_last is the last product stored by recno

Output tables stores records related to outgoings:
recno, prod_first, prod_span, prod_last
1, 500, 25, 525
2, 300, 100, 400
3, 1243, 100, 1343
That mean productss no from 500 to 525, 300 to 400 and 1243 to 1343
were delivered, while the rest from input table are still in stock.
Question: how to find what is [still] in input table and not [yet] in
output table?
prod_first to prod_last are the items to be found in both tables.
I tought like this:
1. make 2 temp tables: input_temp, output_temp where i would insert
all records [one-by-one] since in original tables i have only start &
end.
2. make a comparison between the 2 temp tables (using SQL aggregation)
I am concerned that step 1 would hit the db performance as i have for
input table aprox 100 000 [and growing!] records to insert.

In your system, is it possible to have as outgoing:

1, 1650,100,1750 ? So: overlap on boundaries of ingoing?

Shakespeare


Reply With Quote
  #3  
Old   
Shakespeare
 
Posts: n/a

Default Re: Compare 2 tables - 03-09-2009 , 05:11 AM



coral.ressources (AT) gmail (DOT) com schreef:
Quote:
I have 2 tables: Input and Output.
Input table stores records in this manner:
recno, prod_first, prod_span, prod_last
1, 1, 1000, 1000
2, 1001, 700, 1700
3, 1701, 100, 1800
...and so on
prod_first is the id of the product,
prod_span is the number of products stored by recno,
prod_last is the last product stored by recno

Output tables stores records related to outgoings:
recno, prod_first, prod_span, prod_last
1, 500, 25, 525
2, 300, 100, 400
3, 1243, 100, 1343
That mean productss no from 500 to 525, 300 to 400 and 1243 to 1343
were delivered, while the rest from input table are still in stock.
Question: how to find what is [still] in input table and not [yet] in
output table?
prod_first to prod_last are the items to be found in both tables.
I tought like this:
1. make 2 temp tables: input_temp, output_temp where i would insert
all records [one-by-one] since in original tables i have only start &
end.
2. make a comparison between the 2 temp tables (using SQL aggregation)
I am concerned that step 1 would hit the db performance as i have for
input table aprox 100 000 [and growing!] records to insert.

My first idea would be:
1) Take (a copy of) your input table as a start (in_stock table)
2) For each outputrecord, split records in in_stock and do some calculation
3) Keep the result table, and write triggers on input and output tables
to keep the in_stock table up to date (or make your in_stock table up to
date every now and than with a batch job, only processing new input and
output records in some way)

This approach would stress your system only once.
I know it is not 3NF, for it contains data that can be derived from
other tables, but may help boost your system performance!

You'll have to put in some error checking though, for output may
contain records that are not (yet) in stock (if possible) or may break
your input table record boundaries (see my previous post).


Shakespeare


Reply With Quote
  #4  
Old   
coral.ressources@gmail.com
 
Posts: n/a

Default Re: Compare 2 tables - 03-09-2009 , 06:44 AM



On Mar 9, 1:00*pm, Shakespeare <what... (AT) xs4all (DOT) nl> wrote:
Quote:
coral.ressour... (AT) gmail (DOT) com schreef:



I have 2 tables: Input and Output.
Input table stores records in this manner:
recno, prod_first, prod_span, prod_last
1, 1, 1000, 1000
2, 1001, 700, 1700
3, 1701, 100, 1800
...and so on
prod_first is the id of the product,
prod_span is the number of products stored by recno,
prod_last is the last product stored by recno

Output tables stores records related to outgoings:
recno, prod_first, prod_span, prod_last
1, 500, 25, 525
2, 300, 100, 400
3, 1243, 100, 1343
That mean productss no from 500 to 525, 300 to 400 and 1243 to 1343
were delivered, while the rest from input table are still in stock.
Question: how to find what is [still] in input table and not [yet] in
output table?
prod_first to prod_last are the items to be found in both tables.
I tought like this:
1. make 2 temp tables: input_temp, output_temp where i would insert
all records [one-by-one] since in original tables i have only start &
end.
2. make a comparison between the 2 temp tables (using SQL aggregation)
I am concerned that step 1 would hit the db performance as i have for
input table aprox 100 000 [and growing!] records to insert.

In your system, is it possible to have as outgoing:

1, 1650,100,1750 *? So: overlap on boundaries of ingoing?

Shakespeare
yes, we can have, but interval could not be overlapped; for instance:
i can have as outgoing following records:
1, 500, 25, 525
2, 300, 100, 400
but not:
1, 500, 25, 525
2, 300, 300, 600
because 2. record would overlap something that is [already] out...


Reply With Quote
  #5  
Old   
Ed Prochak
 
Posts: n/a

Default Re: Compare 2 tables - 03-09-2009 , 07:40 AM



On Mar 9, 7:44*am, coral.ressour... (AT) gmail (DOT) com wrote:
Quote:
On Mar 9, 1:00*pm, Shakespeare <what... (AT) xs4all (DOT) nl> wrote:



coral.ressour... (AT) gmail (DOT) com schreef:

I have 2 tables: Input and Output.
Input table stores records in this manner:
recno, prod_first, prod_span, prod_last
1, 1, 1000, 1000
2, 1001, 700, 1700
3, 1701, 100, 1800
...and so on
prod_first is the id of the product,
prod_span is the number of products stored by recno,
prod_last is the last product stored by recno

Output tables stores records related to outgoings:
recno, prod_first, prod_span, prod_last
1, 500, 25, 525
2, 300, 100, 400
3, 1243, 100, 1343
That mean productss no from 500 to 525, 300 to 400 and 1243 to 1343
were delivered, while the rest from input table are still in stock.
Question: how to find what is [still] in input table and not [yet] in
output table?
prod_first to prod_last are the items to be found in both tables.
I tought like this:
1. make 2 temp tables: input_temp, output_temp where i would insert
all records [one-by-one] since in original tables i have only start &
end.
2. make a comparison between the 2 temp tables (using SQL aggregation)
I am concerned that step 1 would hit the db performance as i have for
input table aprox 100 000 [and growing!] records to insert.

In your system, is it possible to have as outgoing:

1, 1650,100,1750 *? So: overlap on boundaries of ingoing?

Shakespeare

yes, we can have, but interval could not be overlapped; for instance:
i can have as outgoing following records:
1, 500, 25, 525
2, 300, 100, 400
but not:
1, 500, 25, 525
2, 300, 300, 600
because 2. record would overlap something that is [already] out...
So really the question is which is faster:

A. loading a couple temp tables and letting the SQL engine do the
comparison
this seems the "brute force" solution. It will work. I think the slow
part of this is loading the temp tables. Properly indexed (use IOTs?),
the comparison should not be a problem.

B. running a PL/SQL procedure that does the comparison and stores it
in some temp table.
This puts all the burden in the developers hands. Generally PL/SQL is
slower than SQL, but this might take advantage of some knowledge of
the tables. there will be two open cursors and I cannot figure out a
algorithm that could be any faster than row by row comparisons.

It seems to me that it should not take much time to write and run
both. Run some timings and let the best plan win. 100,000 rows is not
that many really. Even if that is the row count for the input/output
tables, then the temp tables would be on order of 100,000,000 rows,
but each row would be just the ID. I actually think I would have
designed this using tables more like your temp tables. The net space
used would be less and these comparisons would be easier.

Given no overlap in either input or output tables. But I think the
question was:
is there overlap between input rows and output rows? IOW,
given input table rows like:
95, 301,100,400
96, 401,200,600

can there exist an output table row of:
67, 351,200,550
???


Sorry if this is not much help so far. This sounds interesting. I'll
think about it some more. It sure looks like some memory allocation
schemes I have seen and used. I think I'd lean to a PL/SQL process and
one other table to store results.
Ed





Reply With Quote
  #6  
Old   
coral.ressources@gmail.com
 
Posts: n/a

Default Re: Compare 2 tables - 03-09-2009 , 09:17 AM



On Mar 9, 3:40*pm, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote:
Quote:
On Mar 9, 7:44*am, coral.ressour... (AT) gmail (DOT) com wrote:



On Mar 9, 1:00*pm, Shakespeare <what... (AT) xs4all (DOT) nl> wrote:

coral.ressour... (AT) gmail (DOT) com schreef:

I have 2 tables: Input and Output.
Input table stores records in this manner:
recno, prod_first, prod_span, prod_last
1, 1, 1000, 1000
2, 1001, 700, 1700
3, 1701, 100, 1800
...and so on
prod_first is the id of the product,
prod_span is the number of products stored by recno,
prod_last is the last product stored by recno

Output tables stores records related to outgoings:
recno, prod_first, prod_span, prod_last
1, 500, 25, 525
2, 300, 100, 400
3, 1243, 100, 1343
That mean productss no from 500 to 525, 300 to 400 and 1243 to 1343
were delivered, while the rest from input table are still in stock.
Question: how to find what is [still] in input table and not [yet] in
output table?
prod_first to prod_last are the items to be found in both tables.
I tought like this:
1. make 2 temp tables: input_temp, output_temp where i would insert
all records [one-by-one] since in original tables i have only start&
end.
2. make a comparison between the 2 temp tables (using SQL aggregation)
I am concerned that step 1 would hit the db performance as i have for
input table aprox 100 000 [and growing!] records to insert.

In your system, is it possible to have as outgoing:

1, 1650,100,1750 *? So: overlap on boundaries of ingoing?

Shakespeare

yes, we can have, but interval could not be overlapped; for instance:
i can have as outgoing following records:
1, 500, 25, 525
2, 300, 100, 400
but not:
1, 500, 25, 525
2, 300, 300, 600
because 2. record would overlap something that is [already] out...

So really the question is which is faster:

A. loading a couple temp tables and letting the SQL engine do the
comparison
this seems the "brute force" solution. It will work. I think the slow
part of this is loading the temp tables. Properly indexed (use IOTs?),
the comparison should not be a problem.

B. running a PL/SQL procedure that does the comparison and stores it
in some temp table.
This puts all the burden in the developers hands. Generally PL/SQL is
slower than SQL, but this might take advantage of some knowledge of
the tables. there will be two open cursors and I cannot figure out a
algorithm that could be any faster than row by row comparisons.

It seems to me that it should not take much time to write and run
both. Run some timings and let the best plan win. 100,000 rows is not
that many really. Even if that is the row count for the input/output
tables, then the temp tables would be on order of 100,000,000 rows,
but each row would be just the ID. I actually think I would have
designed this using tables more like your temp tables. The net space
used would be less and these comparisons would be easier.

Given no overlap in either input or output tables. But I think the
question was:
is there overlap between input rows and output rows? IOW,
given input table rows like:
95, 301,100,400
96, 401,200,600

can there exist an output table row of:
67, 351,200,550
???

Sorry if this is not much help so far. *This sounds interesting. I'll
think about it some more. It sure looks like some memory allocation
schemes I have seen and used. I think I'd lean to a PL/SQL process and
one other table to store results.
* *Ed
yes, such overlapping can exist. The idea is that input table has very
few rows with huge span between start and end product, while output
has slightly more rows but with little span between start and end.
Think about it like this : you buy EN-GROSS and sell EN-DETAILLE.


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.