Showing posts with label situation. Show all posts
Showing posts with label situation. Show all posts

Wednesday, March 28, 2012

finding duplicate values

Here's my situation: I work at a healthcare testing labratory, I have a table with all our tests, Dim_Test. The two columns i want to look at are: [Mnemonic] and [Test Name]. Mnemonic is the short test name and Test Name is a long name. The condition of the data is that there are a few duplicate mnemonics, but the the duplicates have different long test names. I know this is really messed up but thats our lab system for you, i have no control over that. I want to make a table that has all mnemonics that are duplicated, and the Long test name for each of the duplications. I have come up with a couple possible solutions but i dont know how to implement them or which would be the best course of action.

The First: Sort on the Mnemonic and delete duplicates, store that to a temp table. Then somehow compare Dim_test and the temp table to find the rows missing from the temp table and then store that to, say, temp2. So temp2 would contain all the data that was deleted during the deletion of duplicates on Dim_test. Then i could use the input Dim_test and do a lookup on table temp2 which would give me all mnemonics and test names with duplicate mnemonics.

The Second and third: Somehow get a dataset that has the distinct mnemonic and the count of each distinct mnemonic. Then i could use a conditional split to find out which mnemonics have a count of greater than one. then i can store that to a temp table and lookup on that. That temp table would be the same as temp2 in the first solution. The difference between the second and third solutions is that one uses an aggregate transformation and the other uses a pivot transformation.

I think the pivot transformation would work best if i pivoted on mnemonic but i dont know how to set it up. Please help. Thank you.

You don't need SSIS for this. This T-SQL will do the trick:

Code Snippet

use tempdb

go

CREATE Table [Dim_Test]

(

[Mnemonic] NVARCHAR (20),

[Test Name] NVARCHAR (200)

)

INSERT INTO [Dim_Test] VALUES ('AAA', 'This is a long test name 1')

INSERT INTO [Dim_Test] VALUES ('BBB', 'This is a long test name 2')

INSERT INTO [Dim_Test] VALUES ('CCC', 'This is a long test name 3')

INSERT INTO [Dim_Test] VALUES ('DDD', 'This is a long test name 4')

INSERT INTO [Dim_Test] VALUES ('EEE', 'This is a long test name 5')

INSERT INTO [Dim_Test] VALUES ('AAA', 'This is a long test name 6')

INSERT INTO [Dim_Test] VALUES ('FFF', 'This is a long test name 7')

INSERT INTO [Dim_Test] VALUES ('BBB', 'This is a long test name 8')

INSERT INTO [Dim_Test] VALUES ('AAA', 'This is a long test name 9')

GO

WITH DuplicatesCTE

AS

(

SELECT [Mnemonic]

FROM [Dim_Test]

GROUP BY [Mnemonic]

HAVING COUNT (*) > 1

)

SELECT [Dim_Test].*

FROM [Dim_Test]

INNER JOIN [DuplicatesCTE]

ON [Dim_Test].[Mnemonic] = [DuplicatesCTE].[Mnemonic]

If your source system is using SQL 2000 you will have to replace the CTE with a derived table, but the technique is the same.

Does this give you what you need?

|||select a.mnemonic, a.[test name] from dim_test a inner join
(select mnemonic, count(*) from dim_test group by mnemonic having count(*) > 1) b
on a.mnemonic = b.mnemonic

This should give you a list of all of the duplicate mnemonics with their [test name]s.

If you want to create a NEW table with this information add an "into tablename" clause like so:

select a.mnemonic, a.[test name] into tablename
from dim_test a inner join
(select mnemonic, count(*) from dim_test group by mnemonic having count(*) > 1) b
on a.mnemonic = b.mnemonic|||

Both of these seemed to work, i ended up using Matthew's solution. I should really learn more about the SQL language, i pretty much do everything in SSIS. Thank you both very much.

Matthew could you please help me compact this. Please excuse my ignorance of the basic SQL language. So all the data is in Dim_test and the part i left out before was that there is also data with duplicate Mnemonics and Test Names example:

VALUES ('AAA', 'This is a long test name 1')

VALUES ('AAA', 'This is a long test name 1')

So this is what i want to do in a Stored procedure, and I need to do it without changing any data in Dim_test:

1. Disregard all duplicates of Mnemonic AND [Test Name] in Dim_test

2. DELETE FROM DupMnemonics

3. Store the results of your query in DupMnemonics

Thank you so much.

This is the query i ended up using

WITH DuplicatesCTE AS (SELECT Mnemonic
FROM DupMnemonics
GROUP BY Mnemonic
HAVING (COUNT(*) > 1))
SELECT DupMnemonics_1.Mnemonic, DupMnemonics_1.[Test Name]
FROM DupMnemonics AS DupMnemonics_1 INNER JOIN
DuplicatesCTE AS DuplicatesCTE_1 ON DupMnemonics_1.Mnemonic = DuplicatesCTE_1.Mnemonic

|||An outer SELECT DISTINCT should eliminate those duplicates.|||

You can use a CTE to do this as well, when used in combination with the ROW_NUMBER ranking funcctiokn. Here's an example:

http://www.eggheadcafe.com/software/aspnet/29974931/removing-duplicate-rows-.aspx

|||

MatthewRoche wrote:

You can use a CTE to do this as well, when used in combination with the ROW_NUMBER ranking funcctiokn. Here's an example:

http://www.eggheadcafe.com/software/aspnet/29974931/removing-duplicate-rows-.aspx

Yeah, to the OP, there are so many ways to achieve this, you may want to move your discussions to the Transact-SQL forum. This is, after all, off topic in the SSIS forum... Wink|||

Is there a way to just move this thread over to T-SQL

Thank you all.

|||

heff89 wrote:

Is there a way to just move this thread over to T-SQL

Thank you all.

There is. I'll do that now for you.

Friday, February 24, 2012

Find a name of an index in deadlock situation.

Hi
I have a deadlock occurring.
Using MSSQL 2005 current activity lock screens I can see one process is
waiting for a lock to be released on a Database object. with objectId =
1441294802092032 and type = KEY - so I know I have a lock on an Index
and what table the index belongs to,
But how can I find the name of the index that is blocked?
This was part of the default view in SQL2000 activity manager. I cannot
use object_name (1441294802092032 ).
Pain is cannot even export the details of the locks per process to a
file.
Yitzak
http://blogs.msdn.com/bartd/archive/2006/09/09/747119.aspx
"Yitzak" <terryshamir@.yahoo.co.uk> wrote in message
news:1164814916.178901.38620@.16g2000cwy.googlegrou ps.com...
> Hi
> I have a deadlock occurring.
> Using MSSQL 2005 current activity lock screens I can see one process is
> waiting for a lock to be released on a Database object. with objectId =
> 1441294802092032 and type = KEY - so I know I have a lock on an Index
> and what table the index belongs to,
> But how can I find the name of the index that is blocked?
> This was part of the default view in SQL2000 activity manager. I cannot
> use object_name (1441294802092032 ).
> Pain is cannot even export the details of the locks per process to a
> file.
>

Find a name of an index in deadlock situation.

Hi
I have a deadlock occurring.
Using MSSQL 2005 current activity lock screens I can see one process is
waiting for a lock to be released on a Database object. with objectId = 1441294802092032 and type = KEY - so I know I have a lock on an Index
and what table the index belongs to,
But how can I find the name of the index that is blocked'
This was part of the default view in SQL2000 activity manager. I cannot
use object_name (1441294802092032 ).
Pain is cannot even export the details of the locks per process to a
file.Yitzak
http://blogs.msdn.com/bartd/archive/2006/09/09/747119.aspx
"Yitzak" <terryshamir@.yahoo.co.uk> wrote in message
news:1164814916.178901.38620@.16g2000cwy.googlegroups.com...
> Hi
> I have a deadlock occurring.
> Using MSSQL 2005 current activity lock screens I can see one process is
> waiting for a lock to be released on a Database object. with objectId => 1441294802092032 and type = KEY - so I know I have a lock on an Index
> and what table the index belongs to,
> But how can I find the name of the index that is blocked'
> This was part of the default view in SQL2000 activity manager. I cannot
> use object_name (1441294802092032 ).
> Pain is cannot even export the details of the locks per process to a
> file.
>

Find a name of an index in deadlock situation.

Hi
I have a deadlock occurring.
Using MSSQL 2005 current activity lock screens I can see one process is
waiting for a lock to be released on a Database object. with objectId =
1441294802092032 and type = KEY - so I know I have a lock on an Index
and what table the index belongs to,
But how can I find the name of the index that is blocked'
This was part of the default view in SQL2000 activity manager. I cannot
use object_name (1441294802092032 ).
Pain is cannot even export the details of the locks per process to a
file.Yitzak
http://blogs.msdn.com/bartd/archive.../09/747119.aspx
"Yitzak" <terryshamir@.yahoo.co.uk> wrote in message
news:1164814916.178901.38620@.16g2000cwy.googlegroups.com...
> Hi
> I have a deadlock occurring.
> Using MSSQL 2005 current activity lock screens I can see one process is
> waiting for a lock to be released on a Database object. with objectId =
> 1441294802092032 and type = KEY - so I know I have a lock on an Index
> and what table the index belongs to,
> But how can I find the name of the index that is blocked'
> This was part of the default view in SQL2000 activity manager. I cannot
> use object_name (1441294802092032 ).
> Pain is cannot even export the details of the locks per process to a
> file.
>