dbTalk Databases Forums  

duplicate checking

comp.databases.ms-sqlserver comp.databases.ms-sqlserver


Discuss duplicate checking in the comp.databases.ms-sqlserver forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Analizer1
 
Posts: n/a

Default duplicate checking - 01-16-2008 , 01:04 PM






Hi Hope you can help....

What would be the most efficient way to de dup a table
I have like 10 columns i need to Check against....

our business rule is any incomming information
can't be within today -90 , ie 90 days

a Example would be helpful
Thanks





Reply With Quote
  #2  
Old   
Plamen Ratchev
 
Posts: n/a

Default Re: duplicate checking - 01-16-2008 , 03:31 PM






Not sure I understand the requirements correctly, but here is an example of
removing duplicates based on multiple columns, within the last 90 days.
Requires SQL Server 2005.

CREATE TABLE Foo (
dcol1 INT,
dcol2 INT,
dcol3 INT,
ddate DATETIME);

INSERT INTO Foo VALUES (1, 2, 3, '20080101');
INSERT INTO Foo VALUES (1, 2, 3, '20070101');
INSERT INTO Foo VALUES (1, 2, 3, '20070101');
INSERT INTO Foo VALUES (1, 2, 3, '20070801');
INSERT INTO Foo VALUES (1, 2, 3, '20071101');
INSERT INTO Foo VALUES (1, 2, 3, '20071101');
INSERT INTO Foo VALUES (1, 2, 3, '20080101');
INSERT INTO Foo VALUES (2, 2, 3, '20080101');

WITH FooCTE
AS
( SELECT dcol1, dcol2, dcol3, ddate,
ROW_NUMBER() OVER(
PARTITION BY dcol1, dcol2, dcol3
ORDER BY ddate) AS rn
FROM Foo
WHERE ddate >= DATEADD(day, DATEDIFF(day, 0, CURRENT_TIMESTAMP), -90)
AND ddate < DATEADD(day, DATEDIFF(day, 0, CURRENT_TIMESTAMP), 1)
)
DELETE FROM FooCTE
WHERE rn > 1;

SELECT dcol1, dcol2, dcol3, ddate
FROM Foo;

HTH,

Plamen Ratchev
http://www.SQLStudio.com


Reply With Quote
  #3  
Old   
Analizer1
 
Posts: n/a

Default Re: duplicate checking - 01-18-2008 , 09:51 AM



thanks alot for the example..i'll be testing ...look somewhat what i
need...

thanks again

"Plamen Ratchev" <Plamen (AT) SQLStudio (DOT) com> wrote

Quote:
Not sure I understand the requirements correctly, but here is an example
of removing duplicates based on multiple columns, within the last 90 days.
Requires SQL Server 2005.

CREATE TABLE Foo (
dcol1 INT,
dcol2 INT,
dcol3 INT,
ddate DATETIME);

INSERT INTO Foo VALUES (1, 2, 3, '20080101');
INSERT INTO Foo VALUES (1, 2, 3, '20070101');
INSERT INTO Foo VALUES (1, 2, 3, '20070101');
INSERT INTO Foo VALUES (1, 2, 3, '20070801');
INSERT INTO Foo VALUES (1, 2, 3, '20071101');
INSERT INTO Foo VALUES (1, 2, 3, '20071101');
INSERT INTO Foo VALUES (1, 2, 3, '20080101');
INSERT INTO Foo VALUES (2, 2, 3, '20080101');

WITH FooCTE
AS
( SELECT dcol1, dcol2, dcol3, ddate,
ROW_NUMBER() OVER(
PARTITION BY dcol1, dcol2, dcol3
ORDER BY ddate) AS rn
FROM Foo
WHERE ddate >= DATEADD(day, DATEDIFF(day, 0, CURRENT_TIMESTAMP), -90)
AND ddate < DATEADD(day, DATEDIFF(day, 0, CURRENT_TIMESTAMP), 1)
)
DELETE FROM FooCTE
WHERE rn > 1;

SELECT dcol1, dcol2, dcol3, ddate
FROM Foo;

HTH,

Plamen Ratchev
http://www.SQLStudio.com



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.