I currently have two columns which is the result after flattening a json file in snowsql
discount id 10.00 1 20.00 1 10.00 2 30.00 3
I want to introduce a third column that shows total discount for a particular id
id total 1 30 2 10 3 30
Following is my code:
select line_item.value:product_id::varchar as id, discount_allocation.value:amount::decimal(36,2) as discount, sum(discount) as total from RAW_DATA_BUFFER, lateral flatten(input=>src:line_items) line_item, lateral flatten(input=>line_item.value:discount_allocations) discount_allocation where record_type='shopify_order' limit 10;
However, this gives error that [LINE_ITEM.VALUE] is not a valid group by expression
Advertisement
Answer
If you need to have some additional “total” (of any kind, supported functions are listed in the documentation) column along with your original columns, you need to use analytical function here.
This way you need to put all the column to aggregate by in partition by <column list> clause and you do not need a group by.
select
line_item.value:product_id::varchar as id,
discount_allocation.value:amount::decimal(36,2) as discount,
sum(discount)
over(
partition by
line_item.value:product_id,
discount_allocation.value:amount
) as total
from RAW_DATA_BUFFER,
lateral flatten(input=>src:line_items) line_item,
lateral flatten(input=>line_item.value:discount_allocations) discount_allocation
where record_type='shopify_order'
limit 10;
Also it’s possible to calculate totals on totals with
select
display_cols,
sum(column_to_aggregate) as aggregated_column_to_aggregate,
sum(sum(column_to_aggregate))
over(partition by columns_to_total_by) as total_column_to_aggregate
from your_table
group by group_by_cols
where the first sum calculates total grouped by group_by_cols and the second (outer) sum calculates an analytical function (totals) on aggregated result by columns_to_total_by. One notice: both display_cols and columns_to_total_by should be in group_by_cols columns since group by is executed first).