I currently have two columns which is the result after flattening a json file in snowsql
discount id
10.00 1
20.00 1
10.00 2
30.00 3
I want to introduce a third column that shows total discount for a particular id
id total
1 30
2 10
3 30
Following is my code:
select
line_item.value:product_id::varchar as id,
discount_allocation.value:amount::decimal(36,2) as discount,
sum(discount) as total
from RAW_DATA_BUFFER,
lateral flatten(input=>src:line_items) line_item,
lateral flatten(input=>line_item.value:discount_allocations) discount_allocation
where record_type='shopify_order'
limit 10;
However, this gives error that [LINE_ITEM.VALUE]
is not a valid group by expression
Advertisement
Answer
If you need to have some additional “total” (of any kind, supported functions are listed in the documentation) column along with your original columns, you need to use analytical function here.
This way you need to put all the column to aggregate by in partition by <column list>
clause and you do not need a group by
.
select
line_item.value:product_id::varchar as id,
discount_allocation.value:amount::decimal(36,2) as discount,
sum(discount)
over(
partition by
line_item.value:product_id,
discount_allocation.value:amount
) as total
from RAW_DATA_BUFFER,
lateral flatten(input=>src:line_items) line_item,
lateral flatten(input=>line_item.value:discount_allocations) discount_allocation
where record_type='shopify_order'
limit 10;
Also it’s possible to calculate totals on totals with
select
display_cols,
sum(column_to_aggregate) as aggregated_column_to_aggregate,
sum(sum(column_to_aggregate))
over(partition by columns_to_total_by) as total_column_to_aggregate
from your_table
group by group_by_cols
where the first sum
calculates total grouped by group_by_cols
and the second (outer) sum
calculates an analytical function (totals) on aggregated result by columns_to_total_by
. One notice: both display_cols
and columns_to_total_by
should be in group_by_cols
columns since group by
is executed first).