SQL Query: Constructing a Control Group

Question

I have two data sets. The first data set contains two (uniquely) identifying characteristics, here ZIP and race, as well as a variable called count. The second data set contains information on …

Accepted Answer

You can use the row_number window function to number the rows of data set 2 within each (zip, race) group, and then join that result to data set 1, keeping only rows whose number does not exceed the count. Note that I renamed count to n here to avoid using a keyword:

    SELECT sub.id,
           sub.zip,
           sub.race,
           sub.outcome
    FROM (
        SELECT id,
               zip,
               race,
               outcome,
               row_number() OVER (PARTITION BY zip, race ORDER BY id) AS rn -- You can order by whatever you want
        FROM data_set_2
    ) sub
    JOIN data_set_1
      ON data_set_1.zip = sub.zip
     AND data_set_1.race = sub.race
     AND data_set_1.n >= sub.rn; -- this will limit the results
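To see how the join limits each group, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. The sample rows are made up for illustration, the column types are assumptions, and the window column is given the alias rn so it can be referenced explicitly. SQLite needs version 3.25 or later for window functions; the original question may well target a different database.

```python
import sqlite3

# Hypothetical toy data; table and column names follow the answer,
# the values and column types are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_set_1 (zip TEXT, race TEXT, n INTEGER);
    CREATE TABLE data_set_2 (id INTEGER, zip TEXT, race TEXT, outcome INTEGER);

    INSERT INTO data_set_1 VALUES ('10001', 'A', 2), ('10001', 'B', 1);

    INSERT INTO data_set_2 VALUES
        (1, '10001', 'A', 0),
        (2, '10001', 'A', 1),
        (3, '10001', 'A', 1),
        (4, '10001', 'B', 0),
        (5, '10002', 'A', 1);
""")

query = """
    SELECT sub.id, sub.zip, sub.race, sub.outcome
    FROM (
        SELECT id, zip, race, outcome,
               -- number the rows within each (zip, race) group
               row_number() OVER (PARTITION BY zip, race ORDER BY id) AS rn
        FROM data_set_2
    ) sub
    JOIN data_set_1
      ON data_set_1.zip = sub.zip
     AND data_set_1.race = sub.race
     AND data_set_1.n >= sub.rn   -- keep at most n rows per group
    ORDER BY sub.id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, '10001', 'A', 0)
# (2, '10001', 'A', 1)
# (4, '10001', 'B', 0)
```

Group ('10001', 'A') has n = 2, so only its first two rows by id survive; group ('10001', 'B') keeps one row; and id 5 is dropped because its (zip, race) pair does not appear in data_set_1. Ordering by id makes the selection deterministic; ordering by a random value instead would draw a random control group.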


Answer