I am working on Steam data and need to find the top 3 tags people used for each game but the problem is all the tag values are in the same row for each game like the image below (appid column then 370 column for each tag and values).
Result can be anything like 200,300,400 or column name plus tag value like action | 200, fps | 300, rpg | 400 etc. for each one of 36k games. The table looks like below and has around 36k games. I definitely don’t want to find one column only or top 3 games.
Appid is unique and there are around 370 columns for each tag which I prepared in a proper format for SQL already. Could you please help me out?
Advertisement
Answer
If you unpivot the data the query becomes trivial. To unpivot the data, you’ll need to retrieve the table metadata and assemble a big SQL according the logic shown below. See the markers on the query below.
The solution will look like:
select * from ( select *, row_number() over(partition by appid order by cnt desc) as rn from ( -- unpivot the data using dynamic SQL select appid, "1980" as tag, [1980] as cnt from t union all select appid, "1990", [1990] from t union all select appid, "2_5d", [2_5d] from t union all select appid, "2d", [2d] from t union all select appid, "2d_fighter", [2d_fighter] from t union all ... select appid, "last_tag", [last_tag] from t -- finish unpivoting ) x ) y where rn <= 3 order by appid, rn