![]() | |
![]() |
| | Thread Tools | Display Modes |
#11
| |||
| |||
|
|
Peter Nilsson schreef: ... Note that if there are rows with the same (id, date, flag = 'Q', dstop) tuple, but different batch numbers, then you'll get every row for that tuple. Is that really what you want?... snip Hint: A better select query might be... select a.* from trafficdata a where a.flag = 'Q' and exists (select 1 from trafficdata b where b.id = a.id and b.date = a.date and b.flag = a.flag and b.dstop = a.dstop and ( b.batch != a.batch or b.batch = a.batch and b.pos_marker > a.pos_marker)); I took a quick look, but if there are corresponding rows on (id, date, flag, dstop) where a.batch <> b.batch it looks like BOTH rows will be deleted. I don't know if that is what you really wanted... |
#12
| |||
| |||
|
|
On Jun 29, 7:03*am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote: On Jun 28, 7:51*pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed Thanks for your reply. No, there isn't don't have a primary key on this data table. I also can't make a backup of the TrafficData table, due to its huge size, but I did make a smaller table to experiment on. Jeff |
#13
| |||
| |||
|
|
Shakespeare <what... (AT) xs4all (DOT) nl> wrote: Peter Nilsson schreef: ... Note that if there are rows with the same (id, date, flag = 'Q', dstop) tuple, but different batch numbers, then you'll get every row for that tuple. Is that really what you want?... snip Hint: A better select query might be... select a.* from trafficdata a where a.flag = 'Q' and exists (select 1 from trafficdata b where b.id = a.id and b.date = a.date and b.flag = a.flag and b.dstop = a.dstop and ( b.batch != a.batch or b.batch = a.batch and b.pos_marker > a.pos_marker)); I took a quick look, but if there are corresponding rows on (id, date, flag, dstop) where a.batch <> b.batch it looks like BOTH rows will be deleted. I don't know if that is what you really wanted... Is there an echo in here?! :-) |
|
The query I posted replicates the semantics of the OP's original select, except that it doesn't repeat matching rows from the target table. The hint about my script maybe being better was that it's a one line change to make it delete instead of select rows. That sort of select is useful in situations like formal data change requests where you have to confirm the affected rows with pre and post selects. Performing a review of the change script prior to running in production is much easier if the select and delete (or update) are all but identical. Of course, it's important to get the select right in the first place. ;-) |
|
-- Peter |
#14
| |||
| |||
|
|
Mark D Powell schreef: On Jun 29, 8:03 am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote: On Jun 28, 7:51 pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed- Hide quoted text - - Show quoted text - Assuming you want to keep one copy of each row the following will work delete from table_name * where (key_list, rowid) in * * * * ( select *keys, rowid from table_name * * * * * minus * * * * * select *keys, min(rowid) from table_name * * * * * group by keys * * * * ) / HTH -- Mark D Powell -- Looks like you are skipping the condition that the record with the largest pos_marker should be preserved, not the lowest rowid. Furthermore, if a.batch != b.batch, both records would be deleted in the original query (if it worked). And is there no mutating table problem here? Won't this lead to 'snapshot too old' problems with large tables? (I'm not sure about that) Shakespeare- Hide quoted text - - Show quoted text - |
#15
| |||
| |||
|
|
On Jun 30, 11:00 am, Shakespeare <what... (AT) xs4all (DOT) nl> wrote: Mark D Powell schreef: On Jun 29, 8:03 am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote: On Jun 28, 7:51 pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed- Hide quoted text - - Show quoted text - Assuming you want to keep one copy of each row the following will work delete from table_name where (key_list, rowid) in ( select keys, rowid from table_name minus select keys, min(rowid) from table_name group by keys ) / HTH -- Mark D Powell -- Looks like you are skipping the condition that the record with the largest pos_marker should be preserved, not the lowest rowid. Furthermore, if a.batch != b.batch, both records would be deleted in the original query (if it worked). And is there no mutating table problem here? Won't this lead to 'snapshot too old' problems with large tables? (I'm not sure about that) Shakespeare- Hide quoted text - - Show quoted text - The delete is a generic template. It would be very easy to modify it to keep the max whatever instead of the min rowid. Mutating tables happen only when a trigger is involved and you attempt to modify the base table from the trigger. |
|
There is no mention of a trigger; however there is mention of a view error. The delete should probably be aimed at one or more of the base tables in the view. |
|
Remove the extra rows from the base table then the view will display the right results since the data has been fixed. Either that or the duplcate data should exist and it is the view definition which needs fixing to filter out what is being called duplicate. I have seen both issues over my career. HTH -- Mark D Powell -- |
#16
| |||
| |||
|
|
On Jun 29, 9:51*am, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date Sic? AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker < b.pos_marker) ); Note that if there are rows with the same (id, date, flag = 'Q', dstop) tuple, but different batch numbers, then you'll get every row for that tuple. Is that really what you want? Or do you only want to remove all but the the top pos_markers for every (id, date, flag = 'Q', dstop, batch) tuple? Or something else...? Note also that this will potentially give you multiple copies of some rows from a. One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Hint: A better select query might be... * select a.* * * from trafficdata a * *where a.flag = 'Q' * * *and exists * * * * * *(select 1 * * * * * * * from trafficdata b * * * * * * *where b.id = a.id * * * * * * * *and b.date = a.date * * * * * * * *and b.flag = a.flag * * * * * * * *and b.dstop = a.dstop * * * * * * * *and ( * b.batch != a.batch * * * * * * * * * * or b.batch *= a.batch * * * * * * * * * * * * *and b.pos_marker > a.pos_marker)); -- Peter |
![]() |
| Thread Tools | |
| Display Modes | |
| |