Wednesday, March 28, 2012

Finding Consecutive Records (Based Upon A Integer Column)

Hi all,
Hopefully one of you brainboxes can help out with a tricky problem
that I just can't get my head around!
I have a table with a single integer column with the example data
(column name = ID):
1,2,3,4,5,7,8,9,10,12,14,16,17
And I need essentially the start value and size of each consecutive
group, ie:
ID Size
-- --
1 5
7 4
12 1
14 1
16 2
In that order (essentially so I can do a TOP 1, WHERE Size >= @.minsizerequired) to find the first group that has enough consecutive
rows in it.
I've been working like this so far:
SELECT l.id AS firstid,
(
SELECT MIN(a.id) FROM @.t a
LEFT OUTER JOIN @.t b ON a.id = b.id - 1
WHERE b.id IS NULL AND a.id >= l.id
) AS lastid
FROM @.t l
LEFT OUTER JOIN @.t r ON r.id = l.id - 1
WHERE r.id IS NULL
But I can't get the size of each group without performing yet another
subquery which I'd like to avoid (performance is a big issue here).
Has anyone got any bright ideas or am I barking up the wrong tree
entirely? I'd also like to avoid using ranking and window functions if
possible as this has to be 2000 compatible.
Thanks very much,
KieranKidogg,
I think your query gives you the first and last id's in each run,
so you can instead ask for
select
l.id as firstid,
(
SELECT MIN(a.id) FROM @.t a
LEFT OUTER JOIN @.t b ON a.id = b.id - 1
WHERE b.id IS NULL AND a.id >= l.id
) - firstid + 1 AS runlength
FROM @.t l
LEFT OUTER JOIN @.t r ON r.id = l.id - 1
WHERE r.id IS NULL
You might also try this approach:
SELECT
MIN(T.id) AS id,
MAX(T.id) - MIN(T.id) + 1 AS runlength
FROM (
SELECT b.id, count(a.id) as rk
FROM W AS a
JOIN W AS b
ON a.id <= b.id
GROUP BY b.id
) T
GROUP BY T.id - T.rk
-- Steve Kass
-- Drew University
Kidogg wrote:
>Hi all,
>Hopefully one of you brainboxes can help out with a tricky problem
>that I just can't get my head around!
>I have a table with a single integer column with the example data
>(column name = ID):
>1,2,3,4,5,7,8,9,10,12,14,16,17
>And I need essentially the start value and size of each consecutive
>group, ie:
>ID Size
>-- --
>1 5
>7 4
>12 1
>14 1
>16 2
>In that order (essentially so I can do a TOP 1, WHERE Size >=>@.minsizerequired) to find the first group that has enough consecutive
>rows in it.
>I've been working like this so far:
>SELECT l.id AS firstid,
>(
> SELECT MIN(a.id) FROM @.t a
> LEFT OUTER JOIN @.t b ON a.id = b.id - 1
> WHERE b.id IS NULL AND a.id >= l.id
>) AS lastid
>FROM @.t l
>LEFT OUTER JOIN @.t r ON r.id = l.id - 1
>WHERE r.id IS NULL
>But I can't get the size of each group without performing yet another
>subquery which I'd like to avoid (performance is a big issue here).
>Has anyone got any bright ideas or am I barking up the wrong tree
>entirely? I'd also like to avoid using ranking and window functions if
>possible as this has to be 2000 compatible.
>Thanks very much,
>Kieran
>
>sql

No comments:

Post a Comment