Skip to content
Advertisement

How to return records with a specific resolution (uniform distributed) of a dataset in SQL

I face a database with a lot of entries (~19 million). On my server queries take about 20 seconds and return a lot of results. But in the frontend (a chart) I don’t need every single row, so my idea is to return only 100 elements of the resulting rows. But not the 100 first or the 100 last: 100 uniformly distributed elements of those rows. In other words, I want to be able to specify a resolution of my data.

Here is some pseudocode.

SELECT time, type, value
FROM [table_name]
WHERE time >= [start_date] AND
      time <= [end_date] AND
      type = [any_type]
LIMIT 100 UNIFORMLY DISTRIBUTED OVER RESULTS;

Do you have any ideas on how to write the query for this? Is this even possible?

Advertisement

Answer

You can use analytical funcrion count and row_number as follows (please use functions and syntax according to your database):

Select * from
(SELECT time, type, value,
       Row_number(1) over (order by time) as rn,
       Count(1) over() as cnt
FROM [table_name]
WHERE time >= [start_date] AND
      time <= [end_date] AND
      type = [any_type]) t
Where mod(rn,trunc(cnt/100)) = 0
Order by time limit 100
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement