Skip to content
Advertisement

Selecting new distinct values over time (ORACLE SQL)

I want to select new distinct values and track them over time.

I have a table where each row represents a score awarded to a particular person. – timestamp (when the score was awarded) – name (which person received the score) – score (what score the person received)

enter image description here

I want the result to look like:

enter image description here

The above table should be interpreted as how many new distinct names appear in each day.

Because 6-NOV is the first day, all the names are new hence 3 new names.
On 7-NOV Michael is the only new name so the value is 1.
On 8-NOV we have 3 new names (Don, Alex, Tina)
And on 9-NOV 0 new names appear a Jimmy and Sara have both been score before.

Thanks for the help

Advertisement

Answer

Find the minimum timestamp for each name and then count how many names in each timestamp

select timestamp, count(*) as new_names from
    (select name, min(timestamp) as timestamp from mytable
    group by name)
group by timestamp
order by timestamp

To include all days even without any names

select t.timestamp, nvl(new_names,0) as new_names from 
(select timestamp, count(*) as new_names from
        (select name, min(timestamp) as timestamp from mytable
        group by name)
group by timestamp) c
RIGHT OUTER JOIN (select distinct timestamp from mytable) t
ON c.timestamp = t.timestamp
order by t.timestamp

To include dates that don’t appear in the table at all you need to have a list of dates from a calendar somewhere and then put that table instead of the subquery I have RIGHT OUTER JOINed to

You can do this

select t.timestamp, nvl(new_names,0) as new_names from 
(select timestamp, count(*) as new_names from
        (select name, min(timestamp) as timestamp from mytable
        group by name)
group by timestamp) c
RIGHT OUTER JOIN (
SELECT TRUNC (SYSDATE - ROWNUM - 1) dt
  FROM DUAL CONNECT BY ROWNUM < 366
) t

ON c.timestamp = t.timestamp
order by t.timestamp

But you’d have to adjust the -1 and 366 to be the date range you wanted and it’s much more standard to use a calendar that already exists in your database

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement