Fragmented array: duplicate

Showing posts with label duplicate. Show all posts

Wednesday, March 28, 2012

Finding Duplicates

I have a company table and I would like to write a query that will return to
me any duplicate companies. However, it is a little more complicated then
just matching on exact company names. I would like it to give me duplicates
where x number of letters at the beginning of the company name match AND x
number of letters of the address match AND x number of letters of the city
match. I will be doing this in batches based on the first letter of the
company name. So for example I will first process all companies that start
with the letter "A".

So for all "A" companies I want to find companies where the first 5 letters
in the company name match and the first 5 characters of the address field
match and the first 5 characters of the city match. THANKS!!!Can you post simplified DDLs, some sample data & expected results? For
detail refer to : www.aspfaq.com/5006

--
- Anith
( Please reply to newsgroups only )|||"Erich" <erich93063@.hotmail.com> wrote in message news:<102gbomj7gc84f7@.corp.supernews.com>...
> I have a company table and I would like to write a query that will return to
> me any duplicate companies. However, it is a little more complicated then
> just matching on exact company names. I would like it to give me duplicates
> where x number of letters at the beginning of the company name match AND x
> number of letters of the address match AND x number of letters of the city
> match. I will be doing this in batches based on the first letter of the
> company name. So for example I will first process all companies that start
> with the letter "A".
> So for all "A" companies I want to find companies where the first 5 letters
> in the company name match and the first 5 characters of the address field
> match and the first 5 characters of the city match. THANKS!!!

Something like this may work:

select t.*
from dbo.MyTable t
join
(
select
left(CompanyName, 5) as 'CompName',
left(Address, 5) as 'Addr',
left(City, 5) as 'City',
count(*) as 'Dupes'
from
dbo.MyTable
where
left(CompanyName, 1) = 'A'
group by
left(CompanyName, 5),
left(Address, 5),
left(City, 5)
having
count(*) > 1
) dt
on dt.CompName = left(t.CompanyName, 5)
and dt.Addr = left(t.Address, 5)
and dt.City = left(t.City, 5)

If this doesn't work as you expect, then please consider posting your
table DDL, as well as some sample data.

Simon|||This may work as well, assuming a parameter is passed into the stored proc
or function for the first letter:

SELECT C1.CompanyID, C2.CompanyID,
C1.CompanyName, C2.CompanyName,
C1.Address, C2.Address, C1.City, C2.City
FROM Company C1 JOIN Company C2 ON
LEFT(C1.CompanyName, 5) = LEFT(C2.CompanyName, 5) AND
LEFT(C1.Address, 5) = LEFT(C2.Address, 5) AND
LEFT(C1.City, 5) = LEFT(C2.City, 5) AND
C1.CompanyID != C2.CompanyID
WHERE LEFT(C1.CompanyName, 1) = @.FirstLetter

You could also use a variable parameter instead of hard-coding "5" to allow
for more specific or more general matches.
ie, ... LEFT(C1.Company, @.MatchLength) = LEFT(C2.CompanyName, @.MatchLength)

"Erich" <erich93063@.hotmail.com> wrote in message
news:102gbomj7gc84f7@.corp.supernews.com...
> I have a company table and I would like to write a query that will return
to
> me any duplicate companies. However, it is a little more complicated then
> just matching on exact company names. I would like it to give me
duplicates
> where x number of letters at the beginning of the company name match AND x
> number of letters of the address match AND x number of letters of the city
> match. I will be doing this in batches based on the first letter of the
> company name. So for example I will first process all companies that start
> with the letter "A".
> So for all "A" companies I want to find companies where the first 5
letters
> in the company name match and the first 5 characters of the address field
> match and the first 5 characters of the city match. THANKS!!!

Finding duplicate values in fields

Hi everyone..

I have a table with about 40,000 records in it.
I want to find records where all of the values in all of the fields, except for the unique field are equal.

The records in this table are answers to a online survey.

So, I want to find all of the records where people answered the questions exactly the same as someone else.

Any help would be appreciated.

NickieWhat about this:

select list of fields
from yourtable
group by list of fields having count(*)>1

finding duplicate values

Here's my situation: I work at a healthcare testing labratory, I have a table with all our tests, Dim_Test. The two columns i want to look at are: [Mnemonic] and [Test Name]. Mnemonic is the short test name and Test Name is a long name. The condition of the data is that there are a few duplicate mnemonics, but the the duplicates have different long test names. I know this is really messed up but thats our lab system for you, i have no control over that. I want to make a table that has all mnemonics that are duplicated, and the Long test name for each of the duplications. I have come up with a couple possible solutions but i dont know how to implement them or which would be the best course of action.

The First: Sort on the Mnemonic and delete duplicates, store that to a temp table. Then somehow compare Dim_test and the temp table to find the rows missing from the temp table and then store that to, say, temp2. So temp2 would contain all the data that was deleted during the deletion of duplicates on Dim_test. Then i could use the input Dim_test and do a lookup on table temp2 which would give me all mnemonics and test names with duplicate mnemonics.

The Second and third: Somehow get a dataset that has the distinct mnemonic and the count of each distinct mnemonic. Then i could use a conditional split to find out which mnemonics have a count of greater than one. then i can store that to a temp table and lookup on that. That temp table would be the same as temp2 in the first solution. The difference between the second and third solutions is that one uses an aggregate transformation and the other uses a pivot transformation.

I think the pivot transformation would work best if i pivoted on mnemonic but i dont know how to set it up. Please help. Thank you.

You don't need SSIS for this. This T-SQL will do the trick:

Code Snippet

use tempdb

CREATE Table [Dim_Test]

(

[Mnemonic] NVARCHAR (20),

[Test Name] NVARCHAR (200)

)

INSERT INTO [Dim_Test] VALUES ('AAA', 'This is a long test name 1')

INSERT INTO [Dim_Test] VALUES ('BBB', 'This is a long test name 2')

INSERT INTO [Dim_Test] VALUES ('CCC', 'This is a long test name 3')

INSERT INTO [Dim_Test] VALUES ('DDD', 'This is a long test name 4')

INSERT INTO [Dim_Test] VALUES ('EEE', 'This is a long test name 5')

INSERT INTO [Dim_Test] VALUES ('AAA', 'This is a long test name 6')

INSERT INTO [Dim_Test] VALUES ('FFF', 'This is a long test name 7')

INSERT INTO [Dim_Test] VALUES ('BBB', 'This is a long test name 8')

INSERT INTO [Dim_Test] VALUES ('AAA', 'This is a long test name 9')

WITH DuplicatesCTE

(

SELECT [Mnemonic]

FROM [Dim_Test]

GROUP BY [Mnemonic]

HAVING COUNT (*) > 1

)

SELECT [Dim_Test].*

FROM [Dim_Test]

INNER JOIN [DuplicatesCTE]

ON [Dim_Test].[Mnemonic] = [DuplicatesCTE].[Mnemonic]

If your source system is using SQL 2000 you will have to replace the CTE with a derived table, but the technique is the same.

Does this give you what you need?

|||select a.mnemonic, a.[test name] from dim_test a inner join
(select mnemonic, count(*) from dim_test group by mnemonic having count(*) > 1) b
on a.mnemonic = b.mnemonic

This should give you a list of all of the duplicate mnemonics with their [test name]s.

If you want to create a NEW table with this information add an "into tablename" clause like so:

select a.mnemonic, a.[test name] into tablename
from dim_test a inner join
(select mnemonic, count(*) from dim_test group by mnemonic having count(*) > 1) b
on a.mnemonic = b.mnemonic|||

Both of these seemed to work, i ended up using Matthew's solution. I should really learn more about the SQL language, i pretty much do everything in SSIS. Thank you both very much.

Matthew could you please help me compact this. Please excuse my ignorance of the basic SQL language. So all the data is in Dim_test and the part i left out before was that there is also data with duplicate Mnemonics and Test Names example:

VALUES ('AAA', 'This is a long test name 1')

So this is what i want to do in a Stored procedure, and I need to do it without changing any data in Dim_test:

1. Disregard all duplicates of Mnemonic AND [Test Name] in Dim_test

2. DELETE FROM DupMnemonics

3. Store the results of your query in DupMnemonics

Thank you so much.

This is the query i ended up using

WITH DuplicatesCTE AS (SELECT Mnemonic
FROM DupMnemonics
GROUP BY Mnemonic
HAVING (COUNT(*) > 1))
SELECT DupMnemonics_1.Mnemonic, DupMnemonics_1.[Test Name]
FROM DupMnemonics AS DupMnemonics_1 INNER JOIN
DuplicatesCTE AS DuplicatesCTE_1 ON DupMnemonics_1.Mnemonic = DuplicatesCTE_1.Mnemonic

|||An outer SELECT DISTINCT should eliminate those duplicates.|||

You can use a CTE to do this as well, when used in combination with the ROW_NUMBER ranking funcctiokn. Here's an example:

http://www.eggheadcafe.com/software/aspnet/29974931/removing-duplicate-rows-.aspx

|||

MatthewRoche wrote:

You can use a CTE to do this as well, when used in combination with the ROW_NUMBER ranking funcctiokn. Here's an example:

http://www.eggheadcafe.com/software/aspnet/29974931/removing-duplicate-rows-.aspx

Yeah, to the OP, there are so many ways to achieve this, you may want to move your discussions to the Transact-SQL forum. This is, after all, off topic in the SSIS forum... |||

Is there a way to just move this thread over to T-SQL

Thank you all.

|||

heff89 wrote:

Is there a way to just move this thread over to T-SQL

Thank you all.

There is. I'll do that now for you.

Finding duplicate records using two columns

I'm trying to find duplicate records in a table. I know how to do that using
a single column value, but I need to do it using the combined values of two
columns, company_id and client_id. There multiple are records with the same
company_id value (e.g., 123) and multiple records with the same department_id
value (e.g., 456). But there should be only one record with a company_id
value of 123 and a department_id value of 456.
I've searched for data in one table that is not in another table on these
two columns using CAST to concatenate the values:
select * from company c
where cast(c.company_id as varchar(20)) + cast(c.client_id as varchar(20))
not in
(select cast(r.company_id as varchar(20)) + cast(r.client_id as varchar(20))
from client r)
But I can't seem to figure out how to use CAST to find duplicate records in
a single table. Should I be doing this a different way?
Any and all help would be appreciated.
JohnSELECT company_id, client_id, count(*) as Duplicates
FROM Company
GROUP BY company_id, client_id
HAVING COUNT(*) > 1
This works great with one column or ten columns.
Roy Harvey
Beacon Falls, CT
On Tue, 30 Oct 2007 12:56:05 -0700, John Steen
<moderndads(nospam)@.hotmail.com> wrote:
>I'm trying to find duplicate records in a table. I know how to do that using
>a single column value, but I need to do it using the combined values of two
>columns, company_id and client_id. There multiple are records with the same
>company_id value (e.g., 123) and multiple records with the same department_id
>value (e.g., 456). But there should be only one record with a company_id
>value of 123 and a department_id value of 456.
>I've searched for data in one table that is not in another table on these
>two columns using CAST to concatenate the values:
>select * from company c
>where cast(c.company_id as varchar(20)) + cast(c.client_id as varchar(20))
>not in
>(select cast(r.company_id as varchar(20)) + cast(r.client_id as varchar(20))
>from client r)
>But I can't seem to figure out how to use CAST to find duplicate records in
>a single table. Should I be doing this a different way?
>Any and all help would be appreciated.
>John|||Thank you, Roy! It worked perfectly.
John
"Roy Harvey (SQL Server MVP)" wrote:
> SELECT company_id, client_id, count(*) as Duplicates
> FROM Company
> GROUP BY company_id, client_id
> HAVING COUNT(*) > 1
> This works great with one column or ten columns.
> Roy Harvey
> Beacon Falls, CT
> On Tue, 30 Oct 2007 12:56:05 -0700, John Steen
> <moderndads(nospam)@.hotmail.com> wrote:
> >I'm trying to find duplicate records in a table. I know how to do that using
> >a single column value, but I need to do it using the combined values of two
> >columns, company_id and client_id. There multiple are records with the same
> >company_id value (e.g., 123) and multiple records with the same department_id
> >value (e.g., 456). But there should be only one record with a company_id
> >value of 123 and a department_id value of 456.
> >
> >I've searched for data in one table that is not in another table on these
> >two columns using CAST to concatenate the values:
> >
> >select * from company c
> >where cast(c.company_id as varchar(20)) + cast(c.client_id as varchar(20))
> >not in
> >(select cast(r.company_id as varchar(20)) + cast(r.client_id as varchar(20))
> >from client r)
> >
> >But I can't seem to figure out how to use CAST to find duplicate records in
> >a single table. Should I be doing this a different way?
> >
> >Any and all help would be appreciated.
> >
> >John
>

Finding duplicate records

I want to clean out our database. We have alot of duplicate records. I want
to run a query on email address to find all the remove all but one of the
duplicate records. I am only working with one table. Can someone help?
TIA
TRD
Check the following articles - they will walk you through
the process:
How to remove duplicate rows from a table in SQL Server
http://support.microsoft.com/?id=139444
How do I remove duplicates from a table?
http://databases.aspfaq.com/database...m-a-table.html
-Sue
On Wed, 27 Sep 2006 17:02:28 -0400, "TRD"
<tdejohnx2@.hotmail.com> wrote:

>I want to clean out our database. We have alot of duplicate records. I want
>to run a query on email address to find all the remove all but one of the
>duplicate records. I am only working with one table. Can someone help?
>TIA
>TRD
>
sql

Finding duplicate records

I want to clean out our database. We have alot of duplicate records. I want
to run a query on email address to find all the remove all but one of the
duplicate records. I am only working with one table. Can someone help?
TIA
TRDCheck the following articles - they will walk you through
the process:
How to remove duplicate rows from a table in SQL Server
http://support.microsoft.com/?id=139444
How do I remove duplicates from a table?
[url]http://databases.aspfaq.com/database/how-do-i-remove-duplicates-from-a-table.html[
/url]
-Sue
On Wed, 27 Sep 2006 17:02:28 -0400, "TRD"
<tdejohnx2@.hotmail.com> wrote:

>I want to clean out our database. We have alot of duplicate records. I want
>to run a query on email address to find all the remove all but one of the
>duplicate records. I am only working with one table. Can someone help?
>TIA
>TRD
>

Finding Duplicate Records

CREATE TABLE KeyLetter(
KeyNum int,
Letter char(1))
INSERT INTO KeyLetter VALUES(111111, 'a')
INSERT INTO KeyLetter VALUES(111111, 'b')
INSERT INTO KeyLetter VALUES(111111, 'c')
INSERT INTO KeyLetter VALUES(222222, 'a')
INSERT INTO KeyLetter VALUES(222222, 'b')
INSERT INTO KeyLetter VALUES(222222, 'c')
INSERT INTO KeyLetter VALUES(111111, 'a')
INSERT INTO KeyLetter VALUES(222222, 'a')
Given the above records, I have written the following to identify duplicate
records, but in my real life situation, I have a table with far more columns
to evaluate for duplication, and I was wondering if there is a better way to
identify duplicates such as this, without performing concatenation?
SELECT Cast(KeyNum as varchar(6)) + Letter),
COUNT(Cast(KeyNum as varchar(6)) + Letter))
FROM KeyLetter
GROUP BY Cast(KeyNum as varchar(6)) + Letter)
HAVING COUNT(Cast(KeyNum as varchar(6)) + Letter)) > 1
--
Message posted via http://www.sqlmonster.comUmm, anything wrong with this:
select KeyNum, Letter, count(*)
from KeyLetter
group by KeyNum, Letter
having count(*)>1
MC
"cbrichards via SQLMonster.com" <u3288@.uwe> wrote in message
news:7989a4847c279@.uwe...
> CREATE TABLE KeyLetter(
> KeyNum int,
> Letter char(1))
> INSERT INTO KeyLetter VALUES(111111, 'a')
> INSERT INTO KeyLetter VALUES(111111, 'b')
> INSERT INTO KeyLetter VALUES(111111, 'c')
> INSERT INTO KeyLetter VALUES(222222, 'a')
> INSERT INTO KeyLetter VALUES(222222, 'b')
> INSERT INTO KeyLetter VALUES(222222, 'c')
> INSERT INTO KeyLetter VALUES(111111, 'a')
> INSERT INTO KeyLetter VALUES(222222, 'a')
> Given the above records, I have written the following to identify
> duplicate
> records, but in my real life situation, I have a table with far more
> columns
> to evaluate for duplication, and I was wondering if there is a better way
> to
> identify duplicates such as this, without performing concatenation?
> SELECT Cast(KeyNum as varchar(6)) + Letter),
> COUNT(Cast(KeyNum as varchar(6)) + Letter))
> FROM KeyLetter
> GROUP BY Cast(KeyNum as varchar(6)) + Letter)
> HAVING COUNT(Cast(KeyNum as varchar(6)) + Letter)) > 1
> --
> Message posted via http://www.sqlmonster.com
>

Finding duplicate records

I want to clean out our database. We have alot of duplicate records. I want
to run a query on email address to find all the remove all but one of the
duplicate records. I am only working with one table. Can someone help?
TIA
TRDCheck the following articles - they will walk you through
the process:
How to remove duplicate rows from a table in SQL Server
http://support.microsoft.com/?id=139444
How do I remove duplicates from a table?
http://databases.aspfaq.com/database/how-do-i-remove-duplicates-from-a-table.html
-Sue
On Wed, 27 Sep 2006 17:02:28 -0400, "TRD"
<tdejohnx2@.hotmail.com> wrote:
>I want to clean out our database. We have alot of duplicate records. I want
>to run a query on email address to find all the remove all but one of the
>duplicate records. I am only working with one table. Can someone help?
>TIA
>TRD
>

Finding Duplicate Names in a Simple Table

This is probably so easy it will be stupid but I cannot get anything to
work.
I need to find, from one table, all duplicate names. I pull the entire list
like this:
SELECT ID, First_Name + ' ' + Last_Name, Date_of_Birth
FROM Attendants
What I need from this is a list where the names of the clients appear more
than once. I know there has to be an easy way to do this. Any quick help'SELECT First_Name,last_Name, count(*)
FROM Attendants
group by first_name, last_name
having count(*) > 1
Geoff Chovaz
MCTS: SQL Server 2005
MCITP: Database Administrator
MCITP: Database Developer
"JOHN HARRIS" <harris1113@.fake.com> wrote in message
news:43D97106-6DC7-495F-868A-109B4738A688@.microsoft.com...
> This is probably so easy it will be stupid but I cannot get anything to
> work.
> I need to find, from one table, all duplicate names. I pull the entire
> list like this:
> SELECT ID, First_Name + ' ' + Last_Name, Date_of_Birth
> FROM Attendants
> What I need from this is a list where the names of the clients appear more
> than once. I know there has to be an easy way to do this. Any quick help'|||SELECT ID, RTRIM(a.First_Name) + ' ' + RTRIM(a.Last_Name) , Date_of_Birth
FROM Attendants AS a
WHERE EXISTS (SELECT RTRIM(b.First_Name) + ' ' + RTRIM(b.Last_Name)
FROM Attendants AS b
WHERE a.First_Name = b.First_Name and
a.Last_Name = b.Last_Name
GROUP BY RTRIM(b.First_Name) + ' ' +
RTRIM(b.Last_Name)
HAVING COUNT(*) > 1 )
ORDER BY RTRIM(a.First_Name) + ' ' + RTRIM(a.Last_Name)
Gail Erickson [MS]
SQL Server Documentation Team
This posting is provided "AS IS" with no warranties, and confers no rights
Download the latest version of Books Online from
http://technet.microsoft.com/en-us/sqlserver/bb428874.aspx
"JOHN HARRIS" <harris1113@.fake.com> wrote in message
news:43D97106-6DC7-495F-868A-109B4738A688@.microsoft.com...
> This is probably so easy it will be stupid but I cannot get anything to
> work.
> I need to find, from one table, all duplicate names. I pull the entire
> list like this:
> SELECT ID, First_Name + ' ' + Last_Name, Date_of_Birth
> FROM Attendants
> What I need from this is a list where the names of the clients appear more
> than once. I know there has to be an easy way to do this. Any quick help'|||Thanks Geoff
Works GREAT!
"Geoff Chovaz" <chovaz@.nospam.nospam> wrote in message
news:u0lwu%237%23HHA.1164@.TK2MSFTNGP02.phx.gbl...
> SELECT First_Name,last_Name, count(*)
> FROM Attendants
> group by first_name, last_name
> having count(*) > 1
>
> --
> Geoff Chovaz
> MCTS: SQL Server 2005
> MCITP: Database Administrator
> MCITP: Database Developer
>
> "JOHN HARRIS" <harris1113@.fake.com> wrote in message
> news:43D97106-6DC7-495F-868A-109B4738A688@.microsoft.com...
>> This is probably so easy it will be stupid but I cannot get anything to
>> work.
>> I need to find, from one table, all duplicate names. I pull the entire
>> list like this:
>> SELECT ID, First_Name + ' ' + Last_Name, Date_of_Birth
>> FROM Attendants
>> What I need from this is a list where the names of the clients appear
>> more than once. I know there has to be an easy way to do this. Any quick
>> help'
>

Finding Duplicate Foreign Keys

Hi

i tried the following query and able to get the list of foreign keys with column names as well as referred tables and referenced column

select parent_column_id as 'child Column',object_name(constraint_object_id)as 'FK Name',object_name(parent_object_id) as 'parent table',name,object_name(referenced_object_id)as 'referenced table',referenced_column_id
from sys.foreign_key_columns inner join sys.columns on (parent_column_id = column_id and parent_object_id=object_id)
Order by object_name(parent_object_id) asc

but i am not able to get the fks created more than once on same column refering to same pk

Thanks in AdvanceIs this helpful?

Field1 has the following values;
1, 2, 3, 3, 4, 5, 6, 6, 6, 7, 8, 9
duplicates highlighted in bold

The following SQL will pick out duplicates only

SELECT Field1
FROM MyTable
GROUP BY Field1
HAVING Count(*) > 1

Field1
--
3
6

:cool:|||I am not sure what is the exact problem, but I think 'Information_Schema' might help. It contains number of views that can be used to retrive meta data.

E.g.
Select * from information_Schema.TABLE_CONSTRAINTS|||here's one that I use in SqlSpec, it may be more than you need though:

select
quotename(su2.name) collate Latin1_General_CI_AS + N'.' collate Latin1_General_CI_AS + quotename(so2.name) collate Latin1_General_CI_AS as parent
,quotename(su1.name) collate Latin1_General_CI_AS + N'.' collate Latin1_General_CI_AS + quotename(so1.name) collate Latin1_General_CI_AS as name
,so1.name as shortname ,sc2.name as colName
,'foreign key from ' + quotename(su2.name) + N'.' + quotename(so2.name) + ' to ' + quotename(su3.name) + N'.' + quotename(so3.name) as description
,quotename(su3.name) collate Latin1_General_CI_AS + N'.' collate Latin1_General_CI_AS + quotename(so3.name) collate Latin1_General_CI_AS as refTable
,sc3.name as refColumn
,ObjectProperty(sf.constid, 'CnstIsDisabled') as isDisabled
,ObjectProperty(sf.constid, 'CnstIsNotRepl') as isNotForReplication
,ObjectProperty(sf.constid, 'CnstIsDeleteCascade') as isDeleteCascade
,ObjectProperty(sf.constid, 'CnstIsUpdateCascade') as isUpdateCascade
,so1.object_id as id from sysforeignkeys sf
join sys.objects so1 on sf.constid=so1.object_id
join sys.objects so2 on sf.fkeyid=so2.object_id
join sys.objects so3 on sf.rkeyid=so3.object_id
join sys.schemas su1 on su1.schema_id=so1.schema_id
join sys.schemas su2 on su2.schema_id=so2.schema_id
join sys.schemas su3 on su3.schema_id=so3.schema_id
join syscolumns sc2 on sc2.colid=sf.fkey and sc2.id=so2.object_id
join syscolumns sc3 on sc3.colid=sf.rkey and sc3.id=so3.object_id|||Thanks You ALL for your reply.

i got that using
INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc and INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE

with CTEsql

Finding duplicate entries in a "smart" way - by comparing first two words

What is the best way to compare two entries in a single table where
the two fields are "almost" the same?

For example, I would like to write a query that would compare the
first two words in a "company" field. If they are the same, I would
like to output them.

For example, "20th Century" and "20th Century Fox" in the company
field would be the same.

How do I do this? Do I need to use a cursor? Is it as simple as using
"Like?"Here's an example:

CREATE TABLE Sometable (company VARCHAR(20) PRIMARY KEY)

INSERT INTO Sometable VALUES ('20th Century')
INSERT INTO Sometable VALUES ('20th Century Fox')
INSERT INTO Sometable VALUES ('Something Else')

SELECT company
FROM Sometable AS S
JOIN
(SELECT first_two_words
FROM
(SELECT company,
LEFT(company,
CHARINDEX(' ',company+' ',CHARINDEX(' ',company)+1))
AS first_two_words
FROM Sometable) AS X
GROUP BY first_two_words
HAVING COUNT(*)>1) AS T
ON T.first_two_words =
LEFT(company,CHARINDEX(' ',company+' ',CHARINDEX(' ',company)+1))

--
David Portas
----
Please reply only to the newsgroup
--|||>> a query that would compare the first two words in a "company"
field [sic]. If they are the same, I would like to output them. <<

Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, datatypes, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications.

CREATE TABLE Foobar
(foo_key INTEGER NOT NULL PRIMARY KEY,
company CHAR(35) NOT NULL,
...);

SELECT F1.foo_key, F2.foo_key
FROM Foobar AS F1, Foobar AS F2
WHERE CHARINDEX(F1.company, F2.company) = 1
AND F1.foo_key < F2.foo_key;

Finding duplicate based on records in same table

I need a help with a sql...
Is there a way to find a duplicate value based on different records
inside the same table?
I got a scenario, where I'm getting same ID with different date...
I need to get rid of one with earlier date...
Thanks in advance...
paulcasso@.gmail.com,
- Can you be more explicit and post some DDl, sample data and expected result?
- Which version are you using 2000 or 2005?
;with cte
as
(
select
[date], [id],
row_number () over(partition by [id] order by [date] DESC) as rn
from
dbo.t1
)
delete cte
where rn > 1;
AMB
"paulcasso@.gmail.com" wrote:

> I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...
>
|||Take a look at this example:
--SQL Begins--
use tempdb
go
create table #tablename(
id_field int not null,
dt_field datetime not null
)
go
insert #tablename values(1, '1/1/2007')
insert #tablename values(2, '1/2/2007')
insert #tablename values(2, '1/3/2007')
insert #tablename values(3, '1/4/2007')
insert #tablename values(2, '1/5/2007')
insert #tablename values(3, '1/6/2007')
insert #tablename values(1, '1/7/2007')
go
-- deletes all but the most recent record for each id
delete a
from #tablename a
left join (select id_field, max(dt_field) dt
from #tablename
group by id_field) b
on a.id_field = b.id_field and a.dt_field = b.dt
where b.id_field is null
go
select * from #tablename
go
drop table #tablename
go
--SQL Ends--
On Apr 11, 2:20 pm, "paulca...@.gmail.com" <paulca...@.gmail.com> wrote:
> I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...
|||Thank you for your help.,
but where does value dt coming from?
...on a.id_field = b.id_field and a.dt_field = b.dt ...
|||On Apr 11, 4:21 pm, "paulca...@.gmail.com" <paulca...@.gmail.com> wrote:
> Thank you for your help.,
> but where does value dt coming from?
> ...on a.id_field = b.id_field and a.dt_field = b.dt ...

>From a previous post:
left join ( select id_field, max(dt_field) dt
dt is an alias for max(dt_field)
Cheers,
Jason Lepack
|||Hi
http://dimantdatabasesolutions.blogspot.com/2007/02/dealing-with-duplicates.html
<paulcasso@.gmail.com> wrote in message
news:1176315620.878773.240790@.y5g2000hsa.googlegro ups.com...
>I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...
>

Finding duplicate based on records in same table

I need a help with a sql...
Is there a way to find a duplicate value based on different records
inside the same table?
I got a scenario, where I'm getting same ID with different date...
I need to get rid of one with earlier date...
Thanks in advance...paulcasso@.gmail.com,
- Can you be more explicit and post some DDl, sample data and expected resul
t?
- Which version are you using 2000 or 2005?
;with cte
as
(
select
[date], [id],
row_number () over(partition by [id] order by [date] DESC) as rn
from
dbo.t1
)
delete cte
where rn > 1;
AMB
"paulcasso@.gmail.com" wrote:

> I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...
>|||Take a look at this example:
--SQL Begins--
use tempdb
go
create table #tablename(
id_field int not null,
dt_field datetime not null
)
go
insert #tablename values(1, '1/1/2007')
insert #tablename values(2, '1/2/2007')
insert #tablename values(2, '1/3/2007')
insert #tablename values(3, '1/4/2007')
insert #tablename values(2, '1/5/2007')
insert #tablename values(3, '1/6/2007')
insert #tablename values(1, '1/7/2007')
go
-- deletes all but the most recent record for each id
delete a
from #tablename a
left join ( select id_field, max(dt_field) dt
from #tablename
group by id_field) b
on a.id_field = b.id_field and a.dt_field = b.dt
where b.id_field is null
go
select * from #tablename
go
drop table #tablename
go
--SQL Ends--
On Apr 11, 2:20 pm, "paulca...@.gmail.com" <paulca...@.gmail.com> wrote:
> I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...|||Thank you for your help.,
but where does value dt coming from?
...on a.id_field = b.id_field and a.dt_field = b.dt ...|||On Apr 11, 4:21 pm, "paulca...@.gmail.com" <paulca...@.gmail.com> wrote:
> Thank you for your help.,
> but where does value dt coming from?
> ...on a.id_field = b.id_field and a.dt_field = b.dt ...

>From a previous post:
left join ( select id_field, max(dt_field) dt
dt is an alias for max(dt_field)
Cheers,
Jason Lepack|||Hi
[url]http://dimantdatabasesolutions.blogspot.com/2007/02/dealing-with-duplicates.html[/
url]
<paulcasso@.gmail.com> wrote in message
news:1176315620.878773.240790@.y5g2000hsa.googlegroups.com...
>I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...
>

Finding duplicate based on records in same table

I need a help with a sql...
Is there a way to find a duplicate value based on different records
inside the same table?
I got a scenario, where I'm getting same ID with different date...
I need to get rid of one with earlier date...
Thanks in advance...paulcasso@.gmail.com,
- Can you be more explicit and post some DDl, sample data and expected result?
- Which version are you using 2000 or 2005?
;with cte
as
(
select
[date], [id],
row_number () over(partition by [id] order by [date] DESC) as rn
from
dbo.t1
)
delete cte
where rn > 1;
AMB
"paulcasso@.gmail.com" wrote:
> I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...
>|||Take a look at this example:
--SQL Begins--
use tempdb
go
create table #tablename(
id_field int not null,
dt_field datetime not null
)
go
insert #tablename values(1, '1/1/2007')
insert #tablename values(2, '1/2/2007')
insert #tablename values(2, '1/3/2007')
insert #tablename values(3, '1/4/2007')
insert #tablename values(2, '1/5/2007')
insert #tablename values(3, '1/6/2007')
insert #tablename values(1, '1/7/2007')
go
-- deletes all but the most recent record for each id
delete a
from #tablename a
left join ( select id_field, max(dt_field) dt
from #tablename
group by id_field) b
on a.id_field = b.id_field and a.dt_field = b.dt
where b.id_field is null
go
select * from #tablename
go
drop table #tablename
go
--SQL Ends--
On Apr 11, 2:20 pm, "paulca...@.gmail.com" <paulca...@.gmail.com> wrote:
> I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...|||Thank you for your help.,
but where does value dt coming from?
...on a.id_field = b.id_field and a.dt_field = b.dt ...|||On Apr 11, 4:21 pm, "paulca...@.gmail.com" <paulca...@.gmail.com> wrote:
> Thank you for your help.,
> but where does value dt coming from?
> ...on a.id_field = b.id_field and a.dt_field = b.dt ...
>From a previous post:
left join ( select id_field, max(dt_field) dt
dt is an alias for max(dt_field)
Cheers,
Jason Lepack|||Hi
http://dimantdatabasesolutions.blogspot.com/2007/02/dealing-with-duplicates.html
<paulcasso@.gmail.com> wrote in message
news:1176315620.878773.240790@.y5g2000hsa.googlegroups.com...
>I need a help with a sql...
> Is there a way to find a duplicate value based on different records
> inside the same table?
> I got a scenario, where I'm getting same ID with different date...
> I need to get rid of one with earlier date...
> Thanks in advance...
>

Finding displaying and deleteing dulpicate records

HI all,
I am trying to find duplicate records ina database that I have taken over. I
am using group bys and count() to find the sups.
I also realise that we could have a problem with misspellings and hterefore
not have a duplicate as in Joe Soape, Jo Soap, Joe Sope Joe Soap. These
couldpossible be sups of the same. I would terefore assume to allow for user
to decide what is dups and what to keep.
Are there any other more acceptable ways and means of doing this.
THis a a sample of what I use
SELECT Name, address1, address2,city, COUNT(*) as numbofdups FROM reinsurer
GROUP BY Name, address1, address2, city HAVING COUNT(*) > 1
Thanks
RobertIf you are using SQL 2005 developer / enterprise edition you could use the
Fuzzy Grouping / Term Extraction transformations in Integration Services and
work it out through that.
Aside from that you need to role your own in T-SQL, there are functions like
SOUNDEX that help to an extent but aren't that brilliant.
Tony.
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials
"Robert Bravery" <me@.u.com> wrote in message
news:u0uuiXbbGHA.3352@.TK2MSFTNGP03.phx.gbl...
> HI all,
> I am trying to find duplicate records ina database that I have taken over.
> I
> am using group bys and count() to find the sups.
> I also realise that we could have a problem with misspellings and
> hterefore
> not have a duplicate as in Joe Soape, Jo Soap, Joe Sope Joe Soap. These
> couldpossible be sups of the same. I would terefore assume to allow for
> user
> to decide what is dups and what to keep.
> Are there any other more acceptable ways and means of doing this.
> THis a a sample of what I use
> SELECT Name, address1, address2,city, COUNT(*) as numbofdups FROM
> reinsurer
> GROUP BY Name, address1, address2, city HAVING COUNT(*) > 1
> Thanks
> Robert
>
>

Monday, March 26, 2012

finding and removing duplicates rows

Say I had a table called 'RAS', with 4 columns:
Date, Time, UserName, GroupName
How can I check for any duplicate rows based on all those columns
combined.
And once they are found, can the duplicates be deleted?
It would be nice to see 2 solutions for learning purposes... one
statement just to show the duplicates, and another separate statement
to find and delete the duplicates.
Thanks in advance!
StephenCREATE TABLE #Demo (
idNo int identity(1,1),
colA int,
colB int
)
INSERT INTO #Demo(colA,colB) VALUES (1,6)
INSERT INTO #Demo(colA,colB) VALUES (1,6)
INSERT INTO #Demo(colA,colB) VALUES (2,4)
INSERT INTO #Demo(colA,colB) VALUES (3,3)
INSERT INTO #Demo(colA,colB) VALUES (4,2)
INSERT INTO #Demo(colA,colB) VALUES (3,3)
INSERT INTO #Demo(colA,colB) VALUES (5,1)
INSERT INTO #Demo(colA,colB) VALUES (8,1)
PRINT 'Table'
SELECT * FROM #Demo
PRINT 'Duplicates in Table'
SELECT * FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo <> B.idNo
AND A.colA = B.colA
AND A.colB = B.colB)
PRINT 'Duplicates to Delete'
SELECT * FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo < B.idNo -- < this time, not <>
AND A.colA = B.colA
AND A.colB = B.colB)
DELETE FROM #Demo
WHERE idNo IN
(SELECT B.idNo
FROM #Demo A JOIN #Demo B
ON A.idNo < B.idNo -- < this time, not <>
AND A.colA = B.colA
AND A.colB = B.colB)
PRINT 'Cleaned-up Table'
SELECT * FROM #Demo
DROP TABLE #Demo
"Stephen" <facetoface72@.hotmail.com> wrote in message
news:d6e21d67.0403160535.5ac3afa7@.posting.google.com...
> Say I had a table called 'RAS', with 4 columns:
> Date, Time, UserName, GroupName
> How can I check for any duplicate rows based on all those columns
> combined.
> And once they are found, can the duplicates be deleted?
> It would be nice to see 2 solutions for learning purposes... one
> statement just to show the duplicates, and another separate statement
> to find and delete the duplicates.
> Thanks in advance!
> Stephen|||Thanks for the message, it definately helped solve a lot of issues.
Is there a way to do everything mentioned, but without having a unique
key for each record?
"Uri Dimant" <urid@.iscar.co.il> wrote in message news:<#6uUEw1CEHA.1452@.TK2MSFTNGP09.phx.gb
l>...
> CREATE TABLE #Demo (
> idNo int identity(1,1),
> colA int,
> colB int
> )
> INSERT INTO #Demo(colA,colB) VALUES (1,6)
> INSERT INTO #Demo(colA,colB) VALUES (1,6)
> INSERT INTO #Demo(colA,colB) VALUES (2,4)
> INSERT INTO #Demo(colA,colB) VALUES (3,3)
> INSERT INTO #Demo(colA,colB) VALUES (4,2)
> INSERT INTO #Demo(colA,colB) VALUES (3,3)
> INSERT INTO #Demo(colA,colB) VALUES (5,1)
> INSERT INTO #Demo(colA,colB) VALUES (8,1)
> PRINT 'Table'
> SELECT * FROM #Demo
> PRINT 'Duplicates in Table'
> SELECT * FROM #Demo
> WHERE idNo IN
> (SELECT B.idNo
> FROM #Demo A JOIN #Demo B
> ON A.idNo <> B.idNo
> AND A.colA = B.colA
> AND A.colB = B.colB)
> PRINT 'Duplicates to Delete'
> SELECT * FROM #Demo
> WHERE idNo IN
> (SELECT B.idNo
> FROM #Demo A JOIN #Demo B
> ON A.idNo < B.idNo -- < this time, not <>
> AND A.colA = B.colA
> AND A.colB = B.colB)
> DELETE FROM #Demo
> WHERE idNo IN
> (SELECT B.idNo
> FROM #Demo A JOIN #Demo B
> ON A.idNo < B.idNo -- < this time, not <>
> AND A.colA = B.colA
> AND A.colB = B.colB)
> PRINT 'Cleaned-up Table'
> SELECT * FROM #Demo
> DROP TABLE #Demo
> "Stephen" <facetoface72@.hotmail.com> wrote in message
> news:d6e21d67.0403160535.5ac3afa7@.posting.google.com...|||On 17 Mar 2004 05:24:49 -0800, Stephen wrote:

>Thanks for the message, it definately helped solve a lot of issues.
>Is there a way to do everything mentioned, but without having a unique
>key for each record?
No. If all columns in two rows are the same, there's no way to tell
them apart. Any where clause will either select both rows or none of
them. If you want to eliminate duplicates, you have to add an IDENTITY
or GUID column, or else you will delete both rows instead of one of
the duplicates.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||DELETE *
FROM table3
WHERE sno not in (SELECT min(sno)
FROM table3
GROUP BY fd1, fd2, fd3);

Finding and Deleting Duplicate Records

I have a table that has many duplicate records and I need to delete all but
one of the duplicate records. Is there a way I can identify these records
with a sql statement?
Example record:
(Comment_id, site_id, date_entered, user_id, comment)
6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
non-franchise signage.
8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
non-franchise signage.Another thing to consider. The comment field is a text data type.
"Mike Collins" wrote:
> I have a table that has many duplicate records and I need to delete all but
> one of the duplicate records. Is there a way I can identify these records
> with a sql statement?
> Example record:
> (Comment_id, site_id, date_entered, user_id, comment)
> 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
> non-franchise signage.
> 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
> non-franchise signage.|||Mike,
Depending on the size of the comment field you could do something like this:
SELECT DISTINCT(CONVERT(VARCHAR(8000),COMMENT))
FROM <TABLENAME>
HTH
Jerry
"Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
news:BB36D700-DB54-43BC-8734-70634BA8B7ED@.microsoft.com...
>I have a table that has many duplicate records and I need to delete all but
> one of the duplicate records. Is there a way I can identify these records
> with a sql statement?
> Example record:
> (Comment_id, site_id, date_entered, user_id, comment)
> 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
> non-franchise signage.
> 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
> non-franchise signage.|||Thanks, but how would I delete all the other records that I don't want?
"Jerry Spivey" wrote:
> Mike,
> Depending on the size of the comment field you could do something like this:
> SELECT DISTINCT(CONVERT(VARCHAR(8000),COMMENT))
> FROM <TABLENAME>
> HTH
> Jerry
> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
> news:BB36D700-DB54-43BC-8734-70634BA8B7ED@.microsoft.com...
> >I have a table that has many duplicate records and I need to delete all but
> > one of the duplicate records. Is there a way I can identify these records
> > with a sql statement?
> >
> > Example record:
> > (Comment_id, site_id, date_entered, user_id, comment)
> > 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
> > non-franchise signage.
> >
> > 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
> > non-franchise signage.
>
>|||Mike,
Can you post the DDL for the table(s) so I don't have to guess about the
keys?
Thanks
Jerry
"Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
news:62C48B0E-9B92-4036-9A23-3244889684A9@.microsoft.com...
> Thanks, but how would I delete all the other records that I don't want?
> "Jerry Spivey" wrote:
>> Mike,
>> Depending on the size of the comment field you could do something like
>> this:
>> SELECT DISTINCT(CONVERT(VARCHAR(8000),COMMENT))
>> FROM <TABLENAME>
>> HTH
>> Jerry
>> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
>> news:BB36D700-DB54-43BC-8734-70634BA8B7ED@.microsoft.com...
>> >I have a table that has many duplicate records and I need to delete all
>> >but
>> > one of the duplicate records. Is there a way I can identify these
>> > records
>> > with a sql statement?
>> >
>> > Example record:
>> > (Comment_id, site_id, date_entered, user_id, comment)
>> > 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
>> > non-franchise signage.
>> >
>> > 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
>> > non-franchise signage.
>>|||CREATE TABLE [dbo].[StatusRptComments] (
[Comment_id] [int] IDENTITY (1, 1) NOT NULL ,
[Site_no] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DateEntered] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[AdminID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Suellen_Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS
NULL ,
[comment_type] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[StatusRptComments] WITH NOCHECK ADD
CONSTRAINT [PK_StatusRptComments] PRIMARY KEY CLUSTERED
(
[Comment_id]
) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
"Jerry Spivey" wrote:
> Mike,
> Can you post the DDL for the table(s) so I don't have to guess about the
> keys?
> Thanks
> Jerry
> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
> news:62C48B0E-9B92-4036-9A23-3244889684A9@.microsoft.com...
> > Thanks, but how would I delete all the other records that I don't want?
> >
> > "Jerry Spivey" wrote:
> >
> >> Mike,
> >>
> >> Depending on the size of the comment field you could do something like
> >> this:
> >>
> >> SELECT DISTINCT(CONVERT(VARCHAR(8000),COMMENT))
> >> FROM <TABLENAME>
> >>
> >> HTH
> >>
> >> Jerry
> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
> >> news:BB36D700-DB54-43BC-8734-70634BA8B7ED@.microsoft.com...
> >> >I have a table that has many duplicate records and I need to delete all
> >> >but
> >> > one of the duplicate records. Is there a way I can identify these
> >> > records
> >> > with a sql statement?
> >> >
> >> > Example record:
> >> > (Comment_id, site_id, date_entered, user_id, comment)
> >> > 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
> >> > non-franchise signage.
> >> >
> >> > 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
> >> > non-franchise signage.
> >>
> >>
> >>
>
>|||Mike,
What criteria do you want for the DELETE? I.e., ok the comment column is
the same for x records but the other columns may be different i.e,
comment_id, dateentered, etc... Which one record do you want to retain?
Thanks
Jerry
"Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
news:5CB48CB2-7776-4C0A-BA97-D0ADDE8171C7@.microsoft.com...
> CREATE TABLE [dbo].[StatusRptComments] (
> [Comment_id] [int] IDENTITY (1, 1) NOT NULL ,
> [Site_no] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [DateEntered] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [AdminID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [Suellen_Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL ,
> [comment_type] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> ) ON [PRIMARY]
> GO
> ALTER TABLE [dbo].[StatusRptComments] WITH NOCHECK ADD
> CONSTRAINT [PK_StatusRptComments] PRIMARY KEY CLUSTERED
> (
> [Comment_id]
> ) WITH FILLFACTOR = 90 ON [PRIMARY]
> GO
> "Jerry Spivey" wrote:
>> Mike,
>> Can you post the DDL for the table(s) so I don't have to guess about the
>> keys?
>> Thanks
>> Jerry
>> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
>> news:62C48B0E-9B92-4036-9A23-3244889684A9@.microsoft.com...
>> > Thanks, but how would I delete all the other records that I don't want?
>> >
>> > "Jerry Spivey" wrote:
>> >
>> >> Mike,
>> >>
>> >> Depending on the size of the comment field you could do something like
>> >> this:
>> >>
>> >> SELECT DISTINCT(CONVERT(VARCHAR(8000),COMMENT))
>> >> FROM <TABLENAME>
>> >>
>> >> HTH
>> >>
>> >> Jerry
>> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in
>> >> message
>> >> news:BB36D700-DB54-43BC-8734-70634BA8B7ED@.microsoft.com...
>> >> >I have a table that has many duplicate records and I need to delete
>> >> >all
>> >> >but
>> >> > one of the duplicate records. Is there a way I can identify these
>> >> > records
>> >> > with a sql statement?
>> >> >
>> >> > Example record:
>> >> > (Comment_id, site_id, date_entered, user_id, comment)
>> >> > 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
>> >> > non-franchise signage.
>> >> >
>> >> > 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
>> >> > non-franchise signage.
>> >>
>> >>
>> >>
>>|||Sorry...I want to keep the entry with the latest date. Thanks for your
patience with my not providing complete details up front.
"Jerry Spivey" wrote:
> Mike,
> What criteria do you want for the DELETE? I.e., ok the comment column is
> the same for x records but the other columns may be different i.e,
> comment_id, dateentered, etc... Which one record do you want to retain?
> Thanks
> Jerry
>
> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
> news:5CB48CB2-7776-4C0A-BA97-D0ADDE8171C7@.microsoft.com...
> > CREATE TABLE [dbo].[StatusRptComments] (
> > [Comment_id] [int] IDENTITY (1, 1) NOT NULL ,
> > [Site_no] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [DateEntered] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [AdminID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [Suellen_Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS
> > NULL ,
> > [comment_type] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> > ) ON [PRIMARY]
> > GO
> >
> > ALTER TABLE [dbo].[StatusRptComments] WITH NOCHECK ADD
> > CONSTRAINT [PK_StatusRptComments] PRIMARY KEY CLUSTERED
> > (
> > [Comment_id]
> > ) WITH FILLFACTOR = 90 ON [PRIMARY]
> > GO
> >
> > "Jerry Spivey" wrote:
> >
> >> Mike,
> >>
> >> Can you post the DDL for the table(s) so I don't have to guess about the
> >> keys?
> >>
> >> Thanks
> >>
> >> Jerry
> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
> >> news:62C48B0E-9B92-4036-9A23-3244889684A9@.microsoft.com...
> >> > Thanks, but how would I delete all the other records that I don't want?
> >> >
> >> > "Jerry Spivey" wrote:
> >> >
> >> >> Mike,
> >> >>
> >> >> Depending on the size of the comment field you could do something like
> >> >> this:
> >> >>
> >> >> SELECT DISTINCT(CONVERT(VARCHAR(8000),COMMENT))
> >> >> FROM <TABLENAME>
> >> >>
> >> >> HTH
> >> >>
> >> >> Jerry
> >> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in
> >> >> message
> >> >> news:BB36D700-DB54-43BC-8734-70634BA8B7ED@.microsoft.com...
> >> >> >I have a table that has many duplicate records and I need to delete
> >> >> >all
> >> >> >but
> >> >> > one of the duplicate records. Is there a way I can identify these
> >> >> > records
> >> >> > with a sql statement?
> >> >> >
> >> >> > Example record:
> >> >> > (Comment_id, site_id, date_entered, user_id, comment)
> >> >> > 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
> >> >> > non-franchise signage.
> >> >> >
> >> >> > 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
> >> >> > non-franchise signage.
> >> >>
> >> >>
> >> >>
> >>
> >>
> >>
>
>|||Mike,
Try:
BEGIN TRAN
DELETE #STATUSRPTCOMMENTS
FROM #STATUSRPTCOMMENTS S2
WHERE SITE_NO = S2.SITE_NO AND ADMINID = S2.ADMINID AND COMMENT = S2.COMMENT
AND S2.DATEENTERED <> (SELECT MAX(DATEENTERED) FROM #STATUSRPTCOMMENTS S1
WHERE S1.SITE_NO = S2.SITE_NO)
--ROLLBACK TRAN
--OR
--COMMIT TRAN
My mind is shot today so use this explicit tran and see if it works if not
roll back and tell me what is wrong and I'll look at it this weekend.
HTH
Jerry
"Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
news:6476D0E1-1054-429E-A456-35982AA76A29@.microsoft.com...
> Sorry...I want to keep the entry with the latest date. Thanks for your
> patience with my not providing complete details up front.
> "Jerry Spivey" wrote:
>> Mike,
>> What criteria do you want for the DELETE? I.e., ok the comment column is
>> the same for x records but the other columns may be different i.e,
>> comment_id, dateentered, etc... Which one record do you want to retain?
>> Thanks
>> Jerry
>>
>> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
>> news:5CB48CB2-7776-4C0A-BA97-D0ADDE8171C7@.microsoft.com...
>> > CREATE TABLE [dbo].[StatusRptComments] (
>> > [Comment_id] [int] IDENTITY (1, 1) NOT NULL ,
>> > [Site_no] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
>> > [DateEntered] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
>> > ,
>> > [AdminID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
>> > [Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
>> > [Suellen_Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS
>> > NULL ,
>> > [comment_type] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
>> > ) ON [PRIMARY]
>> > GO
>> >
>> > ALTER TABLE [dbo].[StatusRptComments] WITH NOCHECK ADD
>> > CONSTRAINT [PK_StatusRptComments] PRIMARY KEY CLUSTERED
>> > (
>> > [Comment_id]
>> > ) WITH FILLFACTOR = 90 ON [PRIMARY]
>> > GO
>> >
>> > "Jerry Spivey" wrote:
>> >
>> >> Mike,
>> >>
>> >> Can you post the DDL for the table(s) so I don't have to guess about
>> >> the
>> >> keys?
>> >>
>> >> Thanks
>> >>
>> >> Jerry
>> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in
>> >> message
>> >> news:62C48B0E-9B92-4036-9A23-3244889684A9@.microsoft.com...
>> >> > Thanks, but how would I delete all the other records that I don't
>> >> > want?
>> >> >
>> >> > "Jerry Spivey" wrote:
>> >> >
>> >> >> Mike,
>> >> >>
>> >> >> Depending on the size of the comment field you could do something
>> >> >> like
>> >> >> this:
>> >> >>
>> >> >> SELECT DISTINCT(CONVERT(VARCHAR(8000),COMMENT))
>> >> >> FROM <TABLENAME>
>> >> >>
>> >> >> HTH
>> >> >>
>> >> >> Jerry
>> >> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in
>> >> >> message
>> >> >> news:BB36D700-DB54-43BC-8734-70634BA8B7ED@.microsoft.com...
>> >> >> >I have a table that has many duplicate records and I need to
>> >> >> >delete
>> >> >> >all
>> >> >> >but
>> >> >> > one of the duplicate records. Is there a way I can identify these
>> >> >> > records
>> >> >> > with a sql statement?
>> >> >> >
>> >> >> > Example record:
>> >> >> > (Comment_id, site_id, date_entered, user_id, comment)
>> >> >> > 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
>> >> >> > non-franchise signage.
>> >> >> >
>> >> >> > 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
>> >> >> > non-franchise signage.
>> >> >>
>> >> >>
>> >> >>
>> >>
>> >>
>> >>
>>|||That works great...thanks Jerry.
"Jerry Spivey" wrote:
> Mike,
> Try:
> BEGIN TRAN
> DELETE #STATUSRPTCOMMENTS
> FROM #STATUSRPTCOMMENTS S2
> WHERE SITE_NO = S2.SITE_NO AND ADMINID = S2.ADMINID AND COMMENT = S2.COMMENT
> AND S2.DATEENTERED <> (SELECT MAX(DATEENTERED) FROM #STATUSRPTCOMMENTS S1
> WHERE S1.SITE_NO = S2.SITE_NO)
> --ROLLBACK TRAN
> --OR
> --COMMIT TRAN
> My mind is shot today so use this explicit tran and see if it works if not
> roll back and tell me what is wrong and I'll look at it this weekend.
> HTH
> Jerry
> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
> news:6476D0E1-1054-429E-A456-35982AA76A29@.microsoft.com...
> > Sorry...I want to keep the entry with the latest date. Thanks for your
> > patience with my not providing complete details up front.
> >
> > "Jerry Spivey" wrote:
> >
> >> Mike,
> >>
> >> What criteria do you want for the DELETE? I.e., ok the comment column is
> >> the same for x records but the other columns may be different i.e,
> >> comment_id, dateentered, etc... Which one record do you want to retain?
> >>
> >> Thanks
> >>
> >> Jerry
> >>
> >>
> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in message
> >> news:5CB48CB2-7776-4C0A-BA97-D0ADDE8171C7@.microsoft.com...
> >> > CREATE TABLE [dbo].[StatusRptComments] (
> >> > [Comment_id] [int] IDENTITY (1, 1) NOT NULL ,
> >> > [Site_no] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> >> > [DateEntered] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> >> > ,
> >> > [AdminID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> >> > [Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> >> > [Suellen_Comment] [varchar] (2500) COLLATE SQL_Latin1_General_CP1_CI_AS
> >> > NULL ,
> >> > [comment_type] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> >> > ) ON [PRIMARY]
> >> > GO
> >> >
> >> > ALTER TABLE [dbo].[StatusRptComments] WITH NOCHECK ADD
> >> > CONSTRAINT [PK_StatusRptComments] PRIMARY KEY CLUSTERED
> >> > (
> >> > [Comment_id]
> >> > ) WITH FILLFACTOR = 90 ON [PRIMARY]
> >> > GO
> >> >
> >> > "Jerry Spivey" wrote:
> >> >
> >> >> Mike,
> >> >>
> >> >> Can you post the DDL for the table(s) so I don't have to guess about
> >> >> the
> >> >> keys?
> >> >>
> >> >> Thanks
> >> >>
> >> >> Jerry
> >> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in
> >> >> message
> >> >> news:62C48B0E-9B92-4036-9A23-3244889684A9@.microsoft.com...
> >> >> > Thanks, but how would I delete all the other records that I don't
> >> >> > want?
> >> >> >
> >> >> > "Jerry Spivey" wrote:
> >> >> >
> >> >> >> Mike,
> >> >> >>
> >> >> >> Depending on the size of the comment field you could do something
> >> >> >> like
> >> >> >> this:
> >> >> >>
> >> >> >> SELECT DISTINCT(CONVERT(VARCHAR(8000),COMMENT))
> >> >> >> FROM <TABLENAME>
> >> >> >>
> >> >> >> HTH
> >> >> >>
> >> >> >> Jerry
> >> >> >> "Mike Collins" <MikeCollins@.discussions.microsoft.com> wrote in
> >> >> >> message
> >> >> >> news:BB36D700-DB54-43BC-8734-70634BA8B7ED@.microsoft.com...
> >> >> >> >I have a table that has many duplicate records and I need to
> >> >> >> >delete
> >> >> >> >all
> >> >> >> >but
> >> >> >> > one of the duplicate records. Is there a way I can identify these
> >> >> >> > records
> >> >> >> > with a sql statement?
> >> >> >> >
> >> >> >> > Example record:
> >> >> >> > (Comment_id, site_id, date_entered, user_id, comment)
> >> >> >> > 6978, AL28492, 07/17/2005, 408, No longer a Franchise, replace w/
> >> >> >> > non-franchise signage.
> >> >> >> >
> >> >> >> > 8161, AL28492, 08/15/2005, 408, No longer a Franchise, replace w/
> >> >> >> > non-franchise signage.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >>
> >> >>
> >> >>
> >>
> >>
> >>
>
>

Finding and Deleting Duplicate Records

Monday, March 12, 2012

Find Records with duplicate values in 1 row keyed to 2 other rows

My table looks like this:
rowid, tenantid, chargecode, date, amount, ..........
I need to get all rows that have multiple entries on the same date(mm/yyyy)
for a given combination of tenantID and chargecode.
for example:
1 tenant1 5 1/1999 $500
2 tenant1 6 1/1999 $25
3 tenant1 5 1/1999 $35
4 tenant2 5 1/1999 $30
5 tenant2 5 2/1999 $30
6 tenant2 6 2/1999 $40
7 tenant2 6 2/1999 $50
I would need to get that tenant1 has 2 records with a chargecode of 5 for
1/1999
and that tenant2 has 2 records with chargecode 6 for 2/1999
SELECT tenantID, chargecode, date, count(*) as Duplciates
FROM TheTable
GROUP BY tenantID, chargecode, date
HAVING count(*) > 1
Roy Harvey
Beacon Falls, CT
On Thu, 22 Feb 2007 16:19:00 -0800, Joe S. <joecrew@.news.postalias>
wrote:

>My table looks like this:
>rowid, tenantid, chargecode, date, amount, ..........
>
>I need to get all rows that have multiple entries on the same date(mm/yyyy)
>for a given combination of tenantID and chargecode.
>for example:
>1 tenant1 5 1/1999 $500
>2 tenant1 6 1/1999 $25
>3 tenant1 5 1/1999 $35
>4 tenant2 5 1/1999 $30
>5 tenant2 5 2/1999 $30
>6 tenant2 6 2/1999 $40
>7 tenant2 6 2/1999 $50
>I would need to get that tenant1 has 2 records with a chargecode of 5 for
>1/1999
>and that tenant2 has 2 records with chargecode 6 for 2/1999

Wednesday, March 28, 2012

Finding Duplicates

Finding duplicate values in fields

finding duplicate values

Finding duplicate records using two columns

Finding duplicate records

Finding duplicate records

Finding Duplicate Records

Finding duplicate records

Finding Duplicate Names in a Simple Table

Finding Duplicate Foreign Keys

Finding duplicate entries in a "smart" way - by comparing first two words

Finding duplicate based on records in same table

Finding duplicate based on records in same table

Finding duplicate based on records in same table

Finding displaying and deleteing dulpicate records

Monday, March 26, 2012

finding and removing duplicates rows

Finding and Deleting Duplicate Records

Finding and Deleting Duplicate Records

Finding and Deleting Duplicate Records

Monday, March 12, 2012

Find Records with duplicate values in 1 row keyed to 2 other rows

Fragmented array

Blog Archive

About Me