Showing posts with label occurs. Show all posts
Showing posts with label occurs. Show all posts

Thursday, March 29, 2012

Finding duplicates - what on earth am I overlooking?

I have two lists of contacts. They're similar. I want a list of all the contacts whose email address occurs only in the first list.

SELECT COUNT(DISTINCT EMAIL) FROM List1
returns 13460

SELECT COUNT(DISTINCT EMAIL) FROM List2
returns 13220

SELECT EMAIL FROM List1 WHERE EMAIL NOT IN (SELECT DISTINCT EMAIL FROM List2)
returns 0 rows

How can it be returning no rows? What am I failing to take into consideration?

:confused:selecting from lis2 where not in ( list 1 )|||Looks ok to me. This will be more efficient so give it a whirl and see if it sorts out your problem anyway:
SELECT EMAIL
FROM List1
WHERE NOT EXISTS(SELECT * FROM List2 WHERE List2.Email = List1.Email)|||Yup, thanks :) I added a bit to take the nulls out, but this seems to work. At least I get rows returned. I'm still at a loss as to why my first effort didn't work though... :confused:

SELECT EMAIL FROM List1
WHERE NOT EXISTS(SELECT * FROM List2 WHERE List2.email = List1.EMAIL AND List2.email IS NOT NULL)
AND List1.EMAIL IS NOT NULL|||You shouldn't need to explicitly remove NULLS... unless you have turned off ANSI_NULLS.|||I've no idea what state my ansi_nulls are in, but both tables have records where the EMAIL field is null, and if I run the query without the nulls-removing bits, just over half of the 13,000-or-so email addresses returned are NULL. I'm just anticipating the account manager coming back to me with "Who's this NULL bloke and why is he spamming our contacts database?"|||Sorry yes - not thinking through. And good looking blocking that NULL fella ;)

Friday, February 24, 2012

Find a date that occurs every 2 days from a fixed date

Hi
I am trying to write a script that only sums the values in a column if the
date for the record occurs every 2 days after another date. The trouble is,
it is a recurring 2 day cycle. i.e Only sum the values if the record date is
every 2nd day from the first date.
eg. First Date=01/03/06. Only sum the values if the record date is 03/03,
05/03, 07/04 etc. If the record does not occur every second day, it should
return 0.
I can use datediff to get the difference and I can cast it into a decimal
and divide by 2 but can't get it to validate in the statement.
Is there an easier way?Mark,
Post your DDL & T-SQL ?
Robert
"Mark - HIS" <MarkHIS@.discussions.microsoft.com> wrote in message
news:43502397-8D5F-46F6-B15D-4B3B384CCBF0@.microsoft.com...
> Hi
> I am trying to write a script that only sums the values in a column if the
> date for the record occurs every 2 days after another date. The trouble
> is,
> it is a recurring 2 day cycle. i.e Only sum the values if the record date
> is
> every 2nd day from the first date.
> eg. First Date=01/03/06. Only sum the values if the record date is 03/03,
> 05/03, 07/04 etc. If the record does not occur every second day, it should
> return 0.
> I can use datediff to get the difference and I can cast it into a decimal
> and divide by 2 but can't get it to validate in the statement.
> Is there an easier way?
>|||Here is the short version of a complex query:
select guest.arrival, multirate.room_type, multirate.rate_date,
count (multirate.room_type)
from guest, multirate
where guest.property=multirate.property and guest.account=multirate.account
and
multirate.rate_date='xxxxx'
The problem is this:
Assume the guest.arrival='2006-03-01'
If the multirate.rate_date in the where statment is '2006-03-02', it must
ignore the record from the count because the datediff between 02/03 and 01/0
3
is not a multiple of 2.
If the multitrate.rate_date='2006-03-03', it must count the record into the
result because it occurs 2 days after the arrival.
The multirate.rate_date will be a variable passed to the statement, so the
query must be able to validate that the rate_date has a multiple of 2
difference between the rate_date and arrival.
Does this make sense?
"Robert Ellis" wrote:

> Mark,
> Post your DDL & T-SQL ?
> Robert
> "Mark - HIS" <MarkHIS@.discussions.microsoft.com> wrote in message
> news:43502397-8D5F-46F6-B15D-4B3B384CCBF0@.microsoft.com...
>
>|||Mark:
Does the following example help? There may be other ways to do it.
IF OBJECT_ID('blah') IS NOT NULL DROP TABLE blah
CREATE TABLE blah
(
BlahGroup CHAR(2) NOT NULL,
FirstDate DATETIME NOT NULL,
SecondDate DATETIME NOT NULL
)
GO
SET DATEFORMAT YMD
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
'2006-01-02'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
'2006-01-03'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
'2006-01-04'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
'2006-01-05'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
'2006-01-06'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
'2006-01-07'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
'2006-01-03'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
'2006-01-04'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
'2006-01-05'
INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
'2006-01-06'
GO
SELECT b.BlahGroup, CountOfBlahGroup = COUNT(b.BlahGroup)
FROM blah b
WHERE ((CAST(DATEDIFF(dd, b.FirstDate, b.SecondDate) AS DECIMAL(10,1))/2) =
(FLOOR(CAST(DATEDIFF(dd, b.FirstDate, b.SecondDate) AS DECIMAL(10,1))/2)))
GROUP BY b.BlahGroup
GO
"Mark - HIS" <MarkHIS@.discussions.microsoft.com> wrote in message
news:5C913A70-67B7-49C4-B377-2C43115CA907@.microsoft.com...
> Here is the short version of a complex query:
> select guest.arrival, multirate.room_type, multirate.rate_date,
> count (multirate.room_type)
> from guest, multirate
> where guest.property=multirate.property and
> guest.account=multirate.account
> and
> multirate.rate_date='xxxxx'
> The problem is this:
> Assume the guest.arrival='2006-03-01'
> If the multirate.rate_date in the where statment is '2006-03-02', it must
> ignore the record from the count because the datediff between 02/03 and
> 01/03
> is not a multiple of 2.
> If the multitrate.rate_date='2006-03-03', it must count the record into
> the
> result because it occurs 2 days after the arrival.
> The multirate.rate_date will be a variable passed to the statement, so the
> query must be able to validate that the rate_date has a multiple of 2
> difference between the rate_date and arrival.
> Does this make sense?
> "Robert Ellis" wrote:
>|||Additionally:
obviously you would add something like:
AND (FirstDate = @.ParamDate)
to the 'WHERE' clause for your implementation...
cheers,
Robert
"Robert Ellis" <robe_2k5.n0sp8m@.hotmail.co.uk> wrote in message
news:%23LZjNahQGHA.4344@.TK2MSFTNGP12.phx.gbl...
> Mark:
> Does the following example help? There may be other ways to do it.
> IF OBJECT_ID('blah') IS NOT NULL DROP TABLE blah
> CREATE TABLE blah
> (
> BlahGroup CHAR(2) NOT NULL,
> FirstDate DATETIME NOT NULL,
> SecondDate DATETIME NOT NULL
> )
> GO
> SET DATEFORMAT YMD
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-02'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-03'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-04'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-05'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-06'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-07'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-03'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-04'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-05'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-06'
> GO
> SELECT b.BlahGroup, CountOfBlahGroup = COUNT(b.BlahGroup)
> FROM blah b
> WHERE ((CAST(DATEDIFF(dd, b.FirstDate, b.SecondDate) AS DECIMAL(10,1))/2)
> =
> (FLOOR(CAST(DATEDIFF(dd, b.FirstDate, b.SecondDate) AS DECIMAL(10,1))/2)))
> GROUP BY b.BlahGroup
> GO
>
> "Mark - HIS" <MarkHIS@.discussions.microsoft.com> wrote in message
> news:5C913A70-67B7-49C4-B377-2C43115CA907@.microsoft.com...
>|||% modulus
argh
"Robert Ellis" <robe_2k5.n0sp8m@.hotmail.co.uk> wrote in message
news:%23LZjNahQGHA.4344@.TK2MSFTNGP12.phx.gbl...
> Mark:
> Does the following example help? There may be other ways to do it.
> IF OBJECT_ID('blah') IS NOT NULL DROP TABLE blah
> CREATE TABLE blah
> (
> BlahGroup CHAR(2) NOT NULL,
> FirstDate DATETIME NOT NULL,
> SecondDate DATETIME NOT NULL
> )
> GO
> SET DATEFORMAT YMD
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-02'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-03'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-04'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-05'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'aa', '2006-01-01',
> '2006-01-06'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-07'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-03'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-04'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-05'
> INSERT blah (BlahGroup, FirstDate, SecondDate) SELECT 'bb', '2006-01-02',
> '2006-01-06'
> GO
> SELECT b.BlahGroup, CountOfBlahGroup = COUNT(b.BlahGroup)
> FROM blah b
> WHERE ((CAST(DATEDIFF(dd, b.FirstDate, b.SecondDate) AS DECIMAL(10,1))/2)
> =
> (FLOOR(CAST(DATEDIFF(dd, b.FirstDate, b.SecondDate) AS DECIMAL(10,1))/2)))
> GROUP BY b.BlahGroup
> GO
>
> "Mark - HIS" <MarkHIS@.discussions.microsoft.com> wrote in message
> news:5C913A70-67B7-49C4-B377-2C43115CA907@.microsoft.com...
>