I’m trying to build a performance score on each player in my table and use this to run some analysis. I’ve tried to use the Elo ranking using the player’s score in the game vs themselves in the prior game. The game is a time-trial race so the faster they complete the better their performance. I simulate a win (1) when the current race is faster than their previous race, a loss (0) when it is slower and draw (.5) when the timing difference is 0.
There are two components to Elo’s algorithm…the Outcome Estimation for all players (two in my case – Ea, Eb) and the ranking update (Ra, Rb).
I have the current setup to simulate one calculation
select player_id, race_ind, k, sa, sb, pre_ra, pre_rb, div0(1, 1 + pow(10,(pre_rb-pre_ra)/400)) ea, div0(1, 1 + pow(10,(pre_ra-pre_rb)/400)) eb, round(pre_ra + k * (sa-ea)) post_ra, round(pre_rb + k * (sb-eb)) post_rb from t
Which returns the below.
player_id | race_ind | k | sa | sb | pre_ra | pre_rb | ea | eb | post_ra | post_rb |
---|---|---|---|---|---|---|---|---|---|---|
xxxxx | 1 | 24 | 1 | 0 | 1000 | 1000 | 0.5 | 0.5 | 1012 | 988 |
xxxxx | 2 | 24 | 1 | 0 | null | null | null | null | null | null |
xxxxx | 3 | 24 | 0 | 1 | null | null | null | null | null | null |
xxxxx | 4 | 24 | 1 | 0 | null | null | null | null | null | null |
But I need to write it in a way that on the subsequent row pre_rb and pre_ra is updated with the previous row’s post_ra and post_rb after calculation so the result should return this table instead
player_id | race_ind | k | sa | sb | pre_ra | pre_rb | ea | eb | post_ra | post_rb |
---|---|---|---|---|---|---|---|---|---|---|
xxxxx | 1 | 24 | 1 | 0 | 1000 | 1000 | 0.5 | 0.5 | 1012 | 988 |
xxxxx | 2 | 24 | 1 | 0 | 1012 | 988 | 0.5344839447 | 0.4655160553 | 1023 | 977 |
xxxxx | 3 | 24 | 0 | 1 | 1023 | 977 | 0.5658152031 | 0.4341847969 | 1009 | 991 |
xxxxx | 4 | 24 | 1 | 0 | 1009 | 991 | 0.5258809309 | 0.4741190691 | 1020 | 980 |
Advertisement
Answer
So reaping the point that values calculated by this process are meaningless. Here is how you can do it. I stripped the formula down to avoid math errors that occur in the CTE due to ambiguous values selection occurring, which can be avoided but each values needs to only refer to the r
or d
values and not the prior calculated intermediate values of the “same row”
So with this data:
create table data (player_id number, race_ind number, k float, sa float); insert into data VALUES (100, 1, 24, 1), (100, 2, 24, 1), (100, 3, 24, 0), (100, 4, 24, 1);
this recursive CTE
with RECURSIVE rec_cte as ( select d.player_id, d.race_ind, d.sa, round(1000::float + k * (d.sa - div0(1, 1 + pow(10,(0::float )/400)))) post_ra, round(1000::float + k * (1-d.sa - div0(1, 1 + pow(10,(0::float )/400)))) post_rb from data as d where race_ind = 1 UNION ALL select d.player_id, d.race_ind, d.sa, round(r.post_ra + k::float * (d.sa-div0(1, 1 + pow(10,(r.post_rb-r.post_ra)/400)))) as post_ra, round(r.post_rb + k::float * (1-d.sa-div0(1, 1 + pow(10,(r.post_ra-r.post_rb)/400)))) as post_rb from rec_cte as r join data as d where r.player_id = d.player_id and r.race_ind + 1 = d.race_ind ) select * from rec_cte
gives:
PLAYER_ID | RACE_IND | SA | POST_RA | POST_RB |
---|---|---|---|---|
100 | 1 | 1 | 1,012 | 988 |
100 | 2 | 1 | 1,023 | 977 |
100 | 3 | 0 | 1,009 | 991 |
100 | 4 | 1 | 1,020 | 980 |