Skip to content
Advertisement

How to get distinct parent_sku count for the below dataset?

Need your help to extract distinct Parent_sku count from the below data set.

Condition: If one child of a parent_sku has “High” sales_tag then that parent_sku count should be excluded from “Low” sales_tag.

P.S. Sales_tag column is based on child_sku column.

Thank you for your help.

Dateset_&_Output

create temp table pb_sku_high as 
(
Select brand
,sales_tag
,count(distinct child_sku) as child_sku_count
,count(distinct parent_sku) as parent_sku_count
from pb_sku_base
Where sales_tag = 'High'
group by 1,2
);

drop table if exists pb_sku_low;

create temp table pb_sku_low as 
(
Select brand
,sales_tag
,count(distinct child_sku) as child_sku_count
,(select count(distinct parent_sku) from pb_sku_base 
        where parent_sku not in 
                           (
                            select parent_sku from pb_asins_base where sales_tag = 'High' group by 1
                           )
    ) as parent_sku_count 
from pb_asins_base
Where sales_tag = 'High'
group by 1,2
);

Select * from pb_sku_high
union all
select * from pb_sku_low;

Advertisement

Answer

The presented schema is non-conducive to this type of query, however we can normalise the structure to extract a parent and the child rows based on the Child_sku value.

Then we can LEFT JOIN the parent to the child row to record the count.

NOTE:
This specific query will not likely return the correct counts if there are more than 1 single child for each parent, it is not clear from the instructions how that should be treated though, so it mnight give you what you need.

SELECT parent.Brand, CASE WHEN parent.Sales_Tag = 'High' OR child.Sales_Tag = 'High' THEN 'High' ELSE 'Low' END as Sales_Tag, COUNT(*) as P_SKU_Count
FROM Product parent
LEFT JOIN Product child ON parent.Brand = child.Brand AND parent.Parent_sku = child.parent_sku AND parent.Child_Sku <> child.Child_Sku
WHERE parent.Child_sku LIKE '%_C1'
GROUP BY parent.Brand, CASE WHEN parent.Sales_Tag = 'High' OR child.Sales_Tag = 'High' THEN 'High' ELSE 'Low' END
Brand Sales_Tag P_SKU_Count
Nike High 3
Nike Low 2

You could also avoid the CASE in the GROUP BY statement and return the two counts in-line:

SELECT parent.Brand
     , SUM(CASE WHEN parent.Sales_Tag = 'High' OR child.Sales_Tag = 'High' THEN 1 END) as High
     , SUM(CASE WHEN parent.Sales_Tag = 'High' OR child.Sales_Tag = 'High' THEN 0 ELSE 1 END) as Low
FROM Product parent
LEFT JOIN Product child ON parent.Brand = child.Brand AND parent.Parent_sku = child.parent_sku AND parent.Child_Sku <> child.Child_Sku
WHERE parent.Child_sku LIKE '%_C1'
GROUP BY parent.Brand;
Brand High Low
Nike 3 2

There is a DB Fiddle here to test with: https://www.db-fiddle.com/f/s2hDvn8EU3QXcdwKqobhdc/0

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement