Skip to content
Advertisement

Write an SQL query that reports the best seller by total sales price, If there is a tie, report them all

I’m doing this leetcode question on SQL, and I have some questions that I wanted to ask you.

-- Table: Product

-- +--------------+---------+
-- | Column Name  | Type    |
-- +--------------+---------+
-- | product_id   | int     |
-- | product_name | varchar |
-- | unit_price   | int     |
-- +--------------+---------+
-- product_id is the primary key of this table.
-- Table: Sales

-- +-------------+---------+
-- | Column Name | Type    |
-- +-------------+---------+
-- | seller_id   | int     |
-- | product_id  | int     |
-- | buyer_id    | int     |
-- | sale_date   | date    |
-- | quantity    | int     |
-- | price       | int     |
-- +------ ------+---------+
-- This table has no primary key, it can have repeated rows.
-- product_id is a foreign key to Product table.
 

-- Write an SQL query that reports the best seller by total sales price, If there is a tie, report them all.

-- The query result format is in the following example:

-- Product table:
-- +------------+--------------+------------+
-- | product_id | product_name | unit_price |
-- +------------+--------------+------------+
-- | 1          | S8           | 1000       |
-- | 2          | G4           | 800        |
-- | 3          | iPhone       | 1400       |
-- +------------+--------------+------------+

-- Sales table:
-- +-----------+------------+----------+------------+----------+-------+
-- | seller_id | product_id | buyer_id | sale_date  | quantity | price |
-- +-----------+------------+----------+------------+----------+-------+
-- | 1         | 1          | 1        | 2019-01-21 | 2        | 2000  |
-- | 1         | 2          | 2        | 2019-02-17 | 1        | 800   |
-- | 2         | 2          | 3        | 2019-06-02 | 1        | 800   |
-- | 3         | 3          | 4        | 2019-05-13 | 2        | 2800  |
-- +-----------+------------+----------+------------+----------+-------+

-- Result table:
-- +-------------+
-- | seller_id   |
-- +-------------+
-- | 1           |
-- | 3           |
-- +-------------+
-- Both sellers with id 1 and 3 sold products with the most total price of 2800.

-- Solution
Select a.seller_id
from
(select seller_id, 
rank() over(order by sum(price) desc) as rk
from sales
group by seller_id) a
where a.rk=1

This is my solution:

with x as 
(select seller_id, sum(price) as total
from(select distinct * from sales)
group by seller_id
)
select seller_id
from x
where total = (select max(total) from x)

My questions are:

  1. First, is my solution true?
  2. Is it necessary select distinct * from sales ? I do it whenever I see This table has no primary key, it can have repeated rows. The others didn’t do it this way, so I’m wondering whether I did it correctly, and it bothered me a lot.
  3. I’m new with SQL, and I write everything using only select, from, where, group by, having, order by. I’m not confident to use for example rank()/row_number(),... over(partition by)..., so my solutions are usually obviously longer than those given by others. I would like to know if the queries written as my way take indeed (a lot) more time to execute? (or how to know/compute the running time of a query?).

Thanks.

Advertisement

Answer

In SQL Server you can use TOP(1) WITH TIES to return not just 1 row, but all top rows with the same total value:

select TOP(1) WITH TIES
    seller_id
    ,sum(price) as total
from sales
group by seller_id
ORDER BY total DESC

No, you do not need DISTINCT. It does not just make the query slower, it may produce incorrect results. In your case it doesn’t matter if the sales table has a primary key or not.

Your query would produce correct results if you remove DISTINCT.

If you use window function like rank() over() the query would likely be faster than your version, but a lot depends on the actual DBMS that you use, available indexes and other things. Most DBMS have methods to see how they run/execute the query. In SQL Server you can see the so-called execution plan, which gives very good idea of what the engine does to produce the result, how much data from which tables and how many times it reads, what transformations and calculations it performs.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement