Monday, March 26, 2012

finding and removing duplicates rows

Say I had a table called 'RAS', with 4 columns:
Date, Time, UserName, GroupName
How can I check for any duplicate rows based on all those columns
combined.
And once they are found, can the duplicates be deleted?
It would be nice to see 2 solutions for learning purposes... one
statement just to show the duplicates, and another separate statement
to find and delete the duplicates.
Thanks in advance!
StephenCREATE TABLE #Demo (
idNo int identity(1,1),
colA int,
colB int
)
INSERT INTO #Demo(colA,colB) VALUES (1,6)
INSERT INTO #Demo(colA,colB) VALUES (1,6)
INSERT INTO #Demo(colA,colB) VALUES (2,4)
INSERT INTO #Demo(colA,colB) VALUES (3,3)
INSERT INTO #Demo(colA,colB) VALUES (4,2)
INSERT INTO #Demo(colA,colB) VALUES (3,3)
INSERT INTO #Demo(colA,colB) VALUES (5,1)
INSERT INTO #Demo(colA,colB) VALUES (8,1)
PRINT 'Table'
SELECT * FROM #Demo
PRINT 'Duplicates in Table'
SELECT * FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo <> B.idNo
AND A.colA = B.colA
AND A.colB = B.colB)
PRINT 'Duplicates to Delete'
SELECT * FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo < B.idNo -- < this time, not <>
AND A.colA = B.colA
AND A.colB = B.colB)
DELETE FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo < B.idNo -- < this time, not <>
AND A.colA = B.colA
AND A.colB = B.colB)
PRINT 'Cleaned-up Table'
SELECT * FROM #Demo
DROP TABLE #Demo
"Stephen" <facetoface72@.hotmail.com> wrote in message
news:d6e21d67.0403160535.5ac3afa7@.posting.google.com...
> Say I had a table called 'RAS', with 4 columns:
> Date, Time, UserName, GroupName
> How can I check for any duplicate rows based on all those columns
> combined.
> And once they are found, can the duplicates be deleted?
> It would be nice to see 2 solutions for learning purposes... one
> statement just to show the duplicates, and another separate statement
> to find and delete the duplicates.
> Thanks in advance!
> Stephen|||Thanks for the message, it definately helped solve a lot of issues.
Is there a way to do everything mentioned, but without having a unique
key for each record?
"Uri Dimant" <urid@.iscar.co.il> wrote in message news:<#6uUEw1CEHA.1452@.TK2MSFTNGP09.phx.gb
l>...
> CREATE TABLE #Demo (
> idNo int identity(1,1),
> colA int,
> colB int
> )
> INSERT INTO #Demo(colA,colB) VALUES (1,6)
> INSERT INTO #Demo(colA,colB) VALUES (1,6)
> INSERT INTO #Demo(colA,colB) VALUES (2,4)
> INSERT INTO #Demo(colA,colB) VALUES (3,3)
> INSERT INTO #Demo(colA,colB) VALUES (4,2)
> INSERT INTO #Demo(colA,colB) VALUES (3,3)
> INSERT INTO #Demo(colA,colB) VALUES (5,1)
> INSERT INTO #Demo(colA,colB) VALUES (8,1)
> PRINT 'Table'
> SELECT * FROM #Demo
> PRINT 'Duplicates in Table'
> SELECT * FROM #Demo
> WHERE idNo IN
> (SELECT B.idNo
> FROM #Demo A JOIN #Demo B
> ON A.idNo <> B.idNo
> AND A.colA = B.colA
> AND A.colB = B.colB)
> PRINT 'Duplicates to Delete'
> SELECT * FROM #Demo
> WHERE idNo IN
> (SELECT B.idNo
> FROM #Demo A JOIN #Demo B
> ON A.idNo < B.idNo -- < this time, not <>
> AND A.colA = B.colA
> AND A.colB = B.colB)
> DELETE FROM #Demo
> WHERE idNo IN
> (SELECT B.idNo
> FROM #Demo A JOIN #Demo B
> ON A.idNo < B.idNo -- < this time, not <>
> AND A.colA = B.colA
> AND A.colB = B.colB)
> PRINT 'Cleaned-up Table'
> SELECT * FROM #Demo
> DROP TABLE #Demo
> "Stephen" <facetoface72@.hotmail.com> wrote in message
> news:d6e21d67.0403160535.5ac3afa7@.posting.google.com...|||On 17 Mar 2004 05:24:49 -0800, Stephen wrote:

>Thanks for the message, it definately helped solve a lot of issues.
>Is there a way to do everything mentioned, but without having a unique
>key for each record?
No. If all columns in two rows are the same, there's no way to tell
them apart. Any where clause will either select both rows or none of
them. If you want to eliminate duplicates, you have to add an IDENTITY
or GUID column, or else you will delete both rows instead of one of
the duplicates.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||DELETE *
FROM table3
WHERE sno not in (SELECT min(sno)
FROM table3
GROUP BY fd1, fd2, fd3);

No comments:

Post a Comment