Friday, February 24, 2012

Find 90th percentile scores for each student

I am stumped on a set-based approach for this one.

A cursor approach is straightforward enough, but i want to avoid that.

Here's my table:

create table StudentScores
(
id int primary key identity(1,1),
student_id int not null,
score int not null
)

with some sample data:

insert into StudentScores (student_id, score)
select 1, 10 union all
select 1, 29 union all
select 1, 50 union all
select 1, 53 union all
select 1, 45 union all
select 1, 10 union all
select 1, 29 union all
select 1, 50 union all
select 1, 53 union all
select 1, 45 union all
select 1, 88 union all
select 2, 23 union all
select 2, 54 union all
select 2, 55 union all
select 2, 34 union all
select 2, 56 union all
select 2, 78 union all
select 2, 23 union all
select 2, 54 union all
select 2, 55 union all
select 2, 34 union all
select 2, 56 union all
select 2, 78 union all
select 2, 23 union all
select 2, 54 union all
select 2, 55 union all
select 2, 34 union all
select 2, 56 union all
select 2, 78 union all
select 2, 98

What I want is, for each student, what is their 90th percentile score?

For a given single student, one possibility would be:

declare @.studentid int
set @.studentid = 2
select top 1 @.studentid as student_id, a.score as [90th percentile score]
from
(
select top 90 percent score from StudentScores
where student_id = @.studentid order by score asc
) as a
order by a.score desc

But I want this for all students, and not use a cursor.

Any ideas?

Thanks!So ... do you want a single maximum score fro each student that is in the top 90th percentile, or all scores that are in the 90th percentile?

If you derive a table of 90th percentile points, then maybe the outer query can bring back the student_id and max score.

Not perfect, but maybe a start:


select StudentScores.student_id, max(StudentScores.score) score, s.[90th percentile score]
from StudentScores
inner join (
select student_id, (sum(score) * 9) / 100 [90th percentile score]
from StudentScores
group by student_id) s ON s.Student_id = StudentScores.student_id
where StudentScores.score > s.[90th percentile score]
group by StudentScores.student_id, StudentScores.score, s.[90th percentile score]
order by StudentScores.student_id desc|||What I want is this:

1. just to be concrete, say each student has 100 scores.
2. for each student, order the student's scores from highest to lowest.
3. pick the 10th score down the list for each student.

If each student had 200 scores, you would pick the 20th score down the list. That's what I mean by 90th percentile score.

Does that make sense?

tomh53 - It's not immediately clear to me that your query produces this. I'll study it more and see. :)|||Perhaps something like this:

select distinct student_id, score
from StudentScores ss
where ss.score = (
select min(score)
from StudentScores ss2
where ss2.studenti_id = ss.studentid and score in (
select top 11 percent score
from StudentScores
where student_id = ss2.studentid
)
)

I guess that top 11 percent would give the the 90th percent as the min of the group, or does that not work?

R|||The article linked below sheds some light on percentiles in tsql:

http://www.sqlteam.com/item.asp?ItemID=16480

Good luck.|||lytri,

your solution worked, except you forgot to add an order by clause in the innermost query. Other than that little change it's perfect.

thanks!|||Ahh yes, that would probably help wouldn't it...

Well, glad I could take a break from work and help someone out :)

No comments:

Post a Comment