
How to write a SQL query without iterations

I have the following challenge: I have a table called hashtags_users_grouped which has the following structure:

Each row tells me which hashtag a given user mentioned and how many times they mentioned it. In this example, user 1 mentioned hashtag 123 once and hashtag 245 three times, while user 2 mentioned only hashtag 123, five times.

I want to write a query that would give me the following output:

In other words, the same information as the first table, but with a column per hashtag, to know the number of times a user mentioned each hashtag.

It would be easy to do this iteratively (for example, using a PySpark data frame and looping over each hashtag), but I would like to achieve it in a single query. Do you know any way to do this?

EDIT: User #Larnu said I should use PIVOT. How would you write a query using it? I tried but didn’t get the expected results.


Answer

Given the column naming you specify and the unknown list of hashtag values, I expect you will have to resort to dynamic SQL.

However, here is a simple PIVOT example based on the data you shared that you may run in SSMS:

RETURNS

In a PIVOT, the values of the pivoted column (e.g. 123, 245) get transposed into column headers, hence the FOR hashtag_id IN ([123], [245]) part. To do this without dynamic SQL, you would have to list every possible hashtag_id value in that IN clause. Because the size of this list is unknown, the query would quickly become unmanageable: it has to be edited every time a new value is introduced. So, dynamic SQL to the rescue.
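A minimal sketch of the static form described above, using a table variable loaded with the sample data from the question. Only hashtag_id and the values 123 and 245 come from the text; the column names user_id and mentions are assumptions made for illustration, so substitute your real column names.

```sql
-- Hypothetical column names: user_id and mentions are assumed,
-- hashtag_id is the name used in the FOR ... IN clause above.
DECLARE @hashtags_users_grouped TABLE (
    user_id    int NOT NULL,
    hashtag_id int NOT NULL,
    mentions   int NOT NULL
);

-- Sample data as described in the question.
INSERT INTO @hashtags_users_grouped (user_id, hashtag_id, mentions)
VALUES (1, 123, 1),
       (1, 245, 3),
       (2, 123, 5);

-- Each value listed in IN (...) becomes its own output column.
SELECT p.user_id, p.[123], p.[245]
FROM (
    SELECT user_id, hashtag_id, mentions
    FROM @hashtags_users_grouped
) AS src
PIVOT (
    SUM(mentions) FOR hashtag_id IN ([123], [245])
) AS p
ORDER BY p.user_id;

-- Result:
-- user_id  123  245
-- 1        1    3
-- 2        5    NULL
```

User 2 never mentioned hashtag 245, so PIVOT yields NULL there; wrap the column in ISNULL(p.[245], 0) if you would rather see a zero.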

Depending on your version of SQL Server, this is how I might approach it using Dynamic SQL:

PRINT @headers;

PRINT @in;

PRINT @pivot;

And finally, to execute the dynamic SQL:

Note: The dynamic SQL example does not reference the table variable.
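As a rough sketch of the dynamic approach, the IN list can be built from the distinct hashtag_id values and spliced into the PIVOT text. This version assumes SQL Server 2017 or later (for STRING_AGG) and reads the permanent table hashtags_users_grouped directly. The column names user_id and mentions are assumptions, and the variable names @in and @pivot simply echo the PRINT statements above; their contents here are a guess at one reasonable implementation.

```sql
-- Hypothetical column names: user_id and mentions are assumed.
DECLARE @in    nvarchar(max);
DECLARE @pivot nvarchar(max);

-- Build the bracketed column list, e.g. [123],[245].
-- QUOTENAME guards against values that would break the generated SQL.
SELECT @in = STRING_AGG(
                 CAST(QUOTENAME(CAST(hashtag_id AS nvarchar(20))) AS nvarchar(max)),
                 N','
             ) WITHIN GROUP (ORDER BY hashtag_id)
FROM (SELECT DISTINCT hashtag_id FROM hashtags_users_grouped) AS h;

-- Splice the list into both the SELECT list and the IN clause.
SET @pivot = N'
SELECT user_id, ' + @in + N'
FROM (
    SELECT user_id, hashtag_id, mentions
    FROM hashtags_users_grouped
) AS src
PIVOT (
    SUM(mentions) FOR hashtag_id IN (' + @in + N')
) AS p
ORDER BY user_id;';

PRINT @pivot;                    -- inspect the generated statement first
EXEC sys.sp_executesql @pivot;   -- then run it
```

On versions before SQL Server 2017, the same list can be built with the STUFF plus FOR XML PATH('') idiom instead of STRING_AGG. If the table is empty, @in is NULL and so is @pivot, so a guard around the EXEC may be worthwhile.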

User contributions licensed under: CC BY-SA