Skip to content
Advertisement

SQL query to split and keep only the top N values

I have the following table data:

| name  |items     |
--------------------
| Bob   |1, 2, 3   |
| Rick  |5, 3, 8, 4|
| Bill  |2, 4      |

I need to create a table with a split items column, but with the limitation to have at most N items per name. E.g. for N = 3 the table should look like this:

|name |item|
-----------
|Bob  |1   |
|Bob  |2   |
|Bob  |3   |
|Rick |5   | 
|Rick |3   | 
|Rick |8   |
|Bill |2   |
|Bill |4   |


I have the following query that splits items correctly, but doesn’t account for the maximum number N. What should I modify in the query (standard SQL, BigQuery) to account for N?

WITH data_split AS (
    SELECT name, SPLIT(items,',') AS item
    FROM (
        SELECT name, items
        -- A lot of additional logic here
        FROM data
    )
)

SELECT name, item
FROM data_split
CROSS JOIN UNNEST(data_split.item) AS item

Advertisement

Answer

Consider below approach (BigQuery)

select name, trim(item) item
from your_table, unnest(split(items)) item with offset 
where offset < 3       

if applied to sample data in your question – output is

enter image description here

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement