I have a dataset as given in the link, DataSet
I want to segregate the column “order_item_unit_status” as separate column and bring respective transaction amount for the same. Desired output is given below.
Objective is to consolidate the txn_amt into respective categories and group them based on txn_date_key. (Basically pivoting based on order_item_unit_status column and bringing txn_amt respectively.)
I used the below code,
Select *, CASE WHEN order_item_unit_status ='DELIVERED' THEN txn_amt ELSE 0 END as DELIVERED, CASE WHEN order_item_unit_status ='RETURNED' THEN txn_amt ELSE 0 END as RETURNED from sales
Got output as referred in the link Output
The output is not grouping based on txn_date_key and multiple line items found. If i use GROUP BY txn_date_key
an error is thrown.
Also I was informed that server is supported by HiveSQL and does not support of using “:”, date time, and temp tables can not be created. I’m currently stuck on how to go about given the constraints.
Help would be much appreciated
Advertisement
Answer
All columns which are not aggregated and selected shold be in group by.
This query produces result you need:
Select txn_date_key, sum(CASE WHEN order_item_unit_status ='DELIVERED' THEN txn_amt ELSE 0 END) as DELIVERED, sum(CASE WHEN order_item_unit_status ='RETURNED' THEN txn_amt ELSE 0 END) as RETURNED from sales group by txn_date_key
Result:
txn_date_key delivered returned 20190701 3200 0 20210631 0 3000