I am working on the HackerRank Top Earners problem. The problem states the following:
We define an employee’s total earnings to be their monthly salary × months worked,
and the maximum total earnings to be the maximum total earnings for
any employee in the Employee table. Write a query to find the maximum
total earnings for all employees as well as the total number of
employees who have maximum total earnings. Then print these values as 2
space-separated integers.
The Employee table containing employee data for a company is described as follows:
| Column      | Type    |
|-------------|---------|
| employee_id | Integer |
| name        | String  |
| months      | Integer |
| salary      | Integer |
The MS SQL Server solution I found that works is:

```sql
SELECT MAX(months * salary), COUNT(salary * months)
FROM employee
WHERE salary * months IN (
    SELECT MAX(salary * months) FROM employee
);
```
What I am struggling to understand is how to break this solution down piece by piece. I know what is going on from SELECT to FROM employee, but after that I feel completely lost as to what this query is doing.
Answer
This is just a two-part query:
Part 1:
```sql
SELECT MAX(salary * months) FROM employee;
```
This gets you the maximum of salary * months. That much is as expected, but now that you know the maximum, you can go back and get the rest of what you wanted by using it as your filtering criterion.
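To make that concrete, here is a small, made-up Employee table; the names and numbers below are purely illustrative, not from the actual judge data:

```sql
-- Hypothetical sample data, purely for illustration.
CREATE TABLE employee (
    employee_id INT,
    name        VARCHAR(50),
    months      INT,
    salary      INT
);

INSERT INTO employee VALUES
    (1, 'Rose',   15, 1968),   -- total earnings: 29520
    (2, 'Angela', 12, 3443),   -- total earnings: 41316
    (3, 'Frank',   6, 6886);   -- total earnings: 41316

-- Part 1: the subquery alone.
SELECT MAX(salary * months) FROM employee;
-- Returns 41316; note that two employees (Angela and Frank) hit it.
```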
Let’s call all of the above X
Part 2:
```sql
SELECT MAX(salary * months), COUNT(salary * months)
FROM employee
WHERE salary * months IN (X);
```
That makes sense, right? The IN filter keeps only the rows whose salary * months equals the maximum, so MAX and COUNT aggregate over just those rows. Substituting X with the subquery from Part 1 gives the complete query:
```sql
SELECT MAX(salary * months), COUNT(salary * months)
FROM employee
WHERE salary * months IN (
    SELECT MAX(salary * months) FROM employee
);
```
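On the hypothetical sample data above, this prints `41316 2`: Angela and Frank are tied for the maximum total earnings, so the count of employees at the maximum is 2.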
So subqueries can be used to evaluate an expression whose result becomes part of the WHERE clause of your main query.
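As a side note, because the subquery returns exactly one value here, a scalar comparison with `=` behaves the same as `IN`; and if you'd rather avoid the WHERE-clause subquery altogether, you can group by the earnings expression instead. Both sketches below assume SQL Server (TOP is T-SQL specific):

```sql
-- Equivalent: = instead of IN, since the subquery yields a single value.
SELECT MAX(salary * months), COUNT(salary * months)
FROM employee
WHERE salary * months = (SELECT MAX(salary * months) FROM employee);

-- Alternative without a WHERE-clause subquery: group rows by their total
-- earnings and keep only the highest group (T-SQL's TOP 1).
SELECT TOP 1 salary * months AS earnings, COUNT(*) AS employee_count
FROM employee
GROUP BY salary * months
ORDER BY earnings DESC;
```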