I have ne table of student score
date | college_id | challenge_id | student_id | score |
---|---|---|---|---|
2020-10-01 | C11219 | CH10101 | S1000 | 80 |
2020-10-01 | C11219 | CH10101 | S1001 | 55 |
2020-10-02 | C11219 | CH10102 | S1000 | 87 |
2020-10-02 | C11219 | CH10102 | S1001 | 42 |
2020-10-03 | C11219 | CH10103 | S1000 | 95 |
2020-10-03 | C11219 | CH10103 | S1001 | 50 |
2020-10-01 | C11220 | CH10101 | S2000 | 57 |
2020-10-01 | C11220 | CH10101 | S2001 | 84 |
2020-10-02 | C11220 | CH10102 | S2000 | 17 |
2020-10-02 | C11220 | CH10102 | S2001 | 34 |
2020-10-03 | C11220 | CH10103 | S2000 | 75 |
2020-10-03 | C11220 | CH10103 | S2001 | 12 |
I want to identify students who had a positive cumulative improvement in their scores. For example: Student S1000 of college C11219 has a positive cumulative improvement score of 15 calculated as (87 – 80) + (95 – 87). Whereas, Student S1001 of college C11219 has a negative cumulative improvement score of -5 calculated as (42 – 55) + (50 – 42).
Advertisement
Answer
The cumulative score does not use intermediate score values.
date | ScoreName | Score |
---|---|---|
2020-10-01 | StartingScore | 80 |
2020-10-02 | Intermediary1 | 87 |
2020-10-03 | Intermediary2 | 95 |
2020-10-04 | Intermediary3 | 50 |
2020-10-05 | Intermediary4 | 55 |
2020-10-06 | EndScore | 95 |
Based on your example, the cummulative score would be :
(87-80)+(95-87)+(50-95)+(55-50)+(95-55)=15
Or we could write is as
(Intermediary1-StartingScore)+(Intermediary2-Intermediary1)+(Intermediary3-Intermediary2)+(Intermediary4-Intermediary3)+(EndScore-Intermediary4)
which if we open up the parentheses, is
EndScore-StartingScore => 95-80=15
Thus, what we need are the first and last score for each student ordered by date. We get that using window functions in a common table expression that we later on filter. I assume the student is identified using the tuple (college_id,student_id)
WITH CTE AS ( SELECT DISTINCT college_id, student_id, FIRST_VALUE(score) OVER (PARTITION BY college_id,student_id ORDER BY date ASC) as StartScore, FIRST_VALUE(score) OVER (PARTITION BY college_id,student_id ORDER BY date DESC) as EndScore FROM YourTable ) SELECT college_id,student_id FROM CTE WHERE EndScore-StartScore > 0