Skip to content
Advertisement

Finding top 10 products sold in a year

I have these tables below along with the definition. I want to find top 10 products sold in a year after finding counts and without using aggregation and in an optimized way. I want to know if aggregation is still needed or I can accomplish it without using aggregation. Below is the query. Can anyone suggest a better approach.

CREATE TABLE Customer (
  id int not null,
  first_name VARCHAR(30),
  last_name VARCHAR(30),
  Address VARCHAR(60),
  State VARCHAR(30),
  Phone text,
  PRIMARY KEY(id)
);



CREATE TABLE Product (
  ProductId int not null,
  name VARCHAR(30),
  unitprice int,
  BrandID int,
  Brandname varchar(30),    
  color VARCHAR(30),
  PRIMARY KEY(ProductId)
);


Create Table Sales (
SalesId int not null,
Date date,
Customerid int,
Productid int,
Purchaseamount int,
PRIMARY KEY(SalesId),
FOREIGN KEY (Productid) REFERENCES Product(ProductId),
FOREIGN KEY (Customerid) REFERENCES Customer(id)    
)

Sample Data:

  insert into 
       Customer(id, first_name, last_name, address, state, phone)
values 
      (1111, 'andy', 'johnson', '123 Maryland Heights', 'MO', 3211451234),
      (1112, 'john', 'smith', '237 Jackson Heights', 'TX', 3671456534),
      (1113, 'sandy', 'fleming', '878 Jersey Heights', 'NJ', 2121456534),
      (1114, 'tony', 'anderson', '789 Harrison Heights', 'CA', 6101456534)



insert into 
       Product(ProductId, name, unitprice, BrandId, Brandname)
values 
      (1, 'watch',200, 100, 'apple'),
      (2, 'ipad', 429, 100, 'apple'),
      (3, 'iphone', 799, 100, 'apple'),
      (4, 'gear', 300, 110, 'samsung'),
      (5, 'phone',1000, 110, 'samsung'),
      (6, 'tab', 250,  110, 'samsung'),
      (7, 'laptop', 1300, 120, 'hp'),
      (8, 'mouse', 10, 120, 'hp'),
      (9, 'monitor', 400, 130, 'dell'),
      (10, 'keyboard', 40, 130, 'dell'),
      (11, 'dvddrive', 100, 130, 'dell'),
      (12, 'dvddrive', 90, 150, 'lg')


insert into 
       Sales(SalesId, Date, CustomerID, ProductID, Purchaseamount)
values (30, '01-10-2019', 1111, 1, 200),
       (31, '02-10-2019', 1111, 3, 799),
       (32, '03-10-2019', 1111, 2, 429),
       (33, '04-10-2019', 1111, 4, 300),
       (34, '05-10-2019', 1111, 5, 1000),
       (35, '06-10-2019', 1112, 7, 1300),
       (36, '07-10-2019', 1112, 9, 400),
       (37, '08-10-2019', 1113, 5, 2000),
       (38, '09-10-2019', 1113, 4, 300),
       (39, '10-10-2019', 1113, 3, 799),
       (40, '11-10-2019', 1113, 2, 858),
       (41, '01-10-2020', 1111, 1, 400),
       (42, '02-10-2020', 1111, 2, 429),
       (43, '03-10-2020', 1112, 7, 1300),
       (44, '04-10-2020', 1113, 7, 2600),
       (45, '05-10-2020', 1114, 7, 1300),
       (46, '06-10-2020', 1114, 7, 1300),
       (47, '07-10-2020', 1114, 9, 800)

Tried this:

SELECT PCY.Name, PCY.Year, PCY.SEQNUM
FROM (SELECT P.Name AS Name, Extract('Year' from S.Date) AS YEAR, COUNT(P.Productid) AS CNT,
             RANK() OVER (PARTITION BY Extract('Year' from S.Date) ORDER BY COUNT(P.Productid) DESC) AS RANK
      FROM Sales S inner JOIN
           Product P
           ON S.Productid = P.Productid
      
     ) PCY
WHERE PCY.RANK <= 10;

I am seeing this error:

ERROR:  column "p.name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 2: FROM (SELECT P.Name AS Name, Extract('Year' from S.Date) AS ...
                     ^
SQL state: 42803
Character: 52

Advertisement

Answer

I don’t understand why you don’t want to use an aggregate function when you have to aggregate over your data. This query works fine, without any issues on the GROUP BY:

WITH stats AS (
    SELECT EXTRACT
        ( YEAR FROM DATE ) AS y,
        P.productid,
        P.NAME,
        COUNT ( * ) numbers_sold,
        RANK ( ) OVER ( PARTITION BY EXTRACT ( YEAR FROM DATE ) ORDER BY COUNT ( * ) DESC ) r 
    FROM
        product
        P JOIN sales S ON S.Productid = P.Productid 
    GROUP BY
        1,2 
)
SELECT  y
    ,   name
    ,   numbers_sold
FROM    stats
WHERE   r <= 10;

This works because the productid is the primary key that has a functional dependency to the product name.

By the way, tested on version 12, but it should work on older and newer versions as well.

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement