![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. |
#3
| |||
| |||
|
|
On Jun 28, 7:51*pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed- Hide quoted text - - Show quoted text - |
#4
| |||
| |||
|
|
On Jun 28, 7:51*pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed |
#5
| |||
| |||
|
|
Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date |
|
AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker < b.pos_marker) ); |
|
One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" |
#6
| |||
| |||
|
|
On Jun 29, 9:51 am, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date Sic? AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker < b.pos_marker) ); Note that if there are rows with the same (id, date, flag = 'Q', dstop) tuple, but different batch numbers, then you'll get every row for that tuple. Is that really what you want? Or do you only want to remove all but the the top pos_markers for every (id, date, flag = 'Q', dstop, batch) tuple? Or something else...? Note also that this will potentially give you multiple copies of some rows from a. One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Hint: A better select query might be... select a.* from trafficdata a where a.flag = 'Q' and exists (select 1 from trafficdata b where b.id = a.id and b.date = a.date and b.flag = a.flag and b.dstop = a.dstop and ( b.batch != a.batch or b.batch = a.batch and b.pos_marker > a.pos_marker)); -- Peter |
#7
| |||
| |||
|
|
On Jun 29, 7:03*am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote: On Jun 28, 7:51*pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed Thanks for your reply. No, there isn't don't have a primary key on this data table. I also can't make a backup of the TrafficData table, due to its huge size, but I did make a smaller table to experiment on. Jeff- Hide quoted text - - Show quoted text - |
#8
| |||
| |||
|
|
On Jun 29, 12:41*pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: On Jun 29, 7:03*am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote: On Jun 28, 7:51*pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed Thanks for your reply. No, there isn't don't have a primary key on this data table. I also can't make a backup of the TrafficData table, due to its huge size, but I did make a smaller table to experiment on. Jeff- Hide quoted text - - Show quoted text - The code I post will work when no PK or UK exists. *The key_list is just the list of columns that qualify the data to be considered duplicate. *If could be one column or all the columns in the row. HTH -- Mark D Powell -- |
#9
| |||
| |||
|
|
On Jun 29, 8:03 am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote: On Jun 28, 7:51 pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed- Hide quoted text - - Show quoted text - Assuming you want to keep one copy of each row the following will work delete from table_name where (key_list, rowid) in ( select keys, rowid from table_name minus select keys, min(rowid) from table_name group by keys ) / HTH -- Mark D Powell -- |
#10
| |||
| |||
|
|
Mark D Powell schreef: On Jun 29, 8:03 am, Ed Prochak <edproc... (AT) gmail (DOT) com> wrote: On Jun 28, 7:51 pm, Jeff Calico <jeffcal... (AT) gmail (DOT) com> wrote: Hello everyone I am trying to delete rows from a massive data table that are "almost" duplicates. I can do a nice query to find them when I join the table with itself, but I can't figure out how to form a delete statement out of this. Query: SELECT a.* FROM Trafficdata a, Trafficdata b WHERE a.id = b.id AND a.date = b.date AND a.flag = 'Q' AND b.flag = 'Q' AND a.dstop = b.dstop AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker b.pos_marker) ); One idea I tried after searching the web was DELETE FROM ( <above query> ); but that gives ORA-01752 "Cannot delete from view without exactly one key preserved table" Any help would be greatly appreciated! Jeff C. Do you have a primary key on this Trafficdata table? Assuming your results are the rows your really want to delete then a delete of the form: DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ; I suggest you do this AFTER doing a backup. Ed- Hide quoted text - - Show quoted text - Assuming you want to keep one copy of each row the following will work delete from table_name where (key_list, rowid) in ( select keys, rowid from table_name minus select keys, min(rowid) from table_name group by keys ) / HTH -- Mark D Powell -- Looks like you are skipping the condition that the record with the largest pos_marker should be preserved, not the lowest rowid. Furthermore, if a.batch != b.batch, both records would be deleted in the original query (if it worked). And is there no mutating table problem here? Won't this lead to 'snapshot too old' problems with large tables? (I'm not sure about that) Shakespeare |
![]() |
| Thread Tools | |
| Display Modes | |
| |