Skip to content
Advertisement

line_item.value is not a valid group by expression

I currently have two columns which is the result after flattening a json file in snowsql

discount     id
   10.00      1
   20.00      1
   10.00      2
   30.00      3

I want to introduce a third column that shows total discount for a particular id

id    total
1       30
2       10
3       30

Following is my code:

select
  line_item.value:product_id::varchar as id, 
  discount_allocation.value:amount::decimal(36,2) as discount,
  sum(discount) as total
from RAW_DATA_BUFFER,
  lateral flatten(input=>src:line_items) line_item,
  lateral flatten(input=>line_item.value:discount_allocations) discount_allocation
where record_type='shopify_order'
limit 10;

However, this gives error that [LINE_ITEM.VALUE] is not a valid group by expression

Advertisement

Answer

If you need to have some additional “total” (of any kind, supported functions are listed in the documentation) column along with your original columns, you need to use analytical function here.

This way you need to put all the column to aggregate by in partition by <column list> clause and you do not need a group by.

select
  line_item.value:product_id::varchar as id, 
  discount_allocation.value:amount::decimal(36,2) as discount,
  sum(discount)
    over(
      partition by
        line_item.value:product_id,
        discount_allocation.value:amount
    ) as total
from RAW_DATA_BUFFER,
  lateral flatten(input=>src:line_items) line_item,
  lateral flatten(input=>line_item.value:discount_allocations) discount_allocation
where record_type='shopify_order'
limit 10;

Also it’s possible to calculate totals on totals with

select
  display_cols,
  sum(column_to_aggregate) as aggregated_column_to_aggregate,
  sum(sum(column_to_aggregate))
    over(partition by columns_to_total_by) as total_column_to_aggregate
from your_table
group by group_by_cols

where the first sum calculates total grouped by group_by_cols and the second (outer) sum calculates an analytical function (totals) on aggregated result by columns_to_total_by. One notice: both display_cols and columns_to_total_by should be in group_by_cols columns since group by is executed first).

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement