Skip to content
Advertisement

SQL group by function to categorize only the most recent data

So I have this table called title where it stores all of the title held by each employee which will look like this

emp_no title start_date
101 Engineer 2019-01-01
101 Senior Engineer 2020-02-01
102 Engineer 2019-01-11
102 Senior Engineer 2020-02-11
103 Engineer 2019-01-21
104 Engineer 2019-01-31
105 Associate 2019-01-01
106 Associate 2019-01-11
106 Manager 2020-02-11
107 Associate 2019-01-21
107 Manager 2020-02-21
108 Associate 2019-01-31

Notice that each employee can have more than 1 title. For example emp 101 title is engineer in 1st January 2019 but got promoted as senior engineer one year later.

Now lets say i want to count how many employees for each position. I have tried using the count function along with group by (to group the number of employee by the title) but the problem is, the SQL query also count the past position of every employee.

To be exact, I only want to include the most recent role that an employee currently has. So in this case, the result I am expecting is Engineer: 2 employees (because the other 2 has been promotod to senior engineer), Senior engineer: 2 employees, Associate: 2 employees (because the other 2 has been promotod to manager), Manager: 2 employees

Is there some kind of way to achieve that? NOTE: this table format is from one of the SQL online course that i’m taking so I’m not the one who make the table. and also in the original table in containes tens of thousands of data.

Advertisement

Answer

You can use not exists as follows:

select title, count(*) as Count
  from your_table t
 where not exists 
       (select 1 from your_table tt
         where tt.emp_no = t.emp_no and tt.start_date> t.start_date)
group by title
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement