I have a large table that gets anywhere from 1-3 new entries per minute. I need to be able to find records at specific times which I can do by using a SELECT
statement but it’s incredibly slow. Lets say the table looks like this:
Device | Date-Time | Data | ----------------------------------- 1 | 2020-01-01 08:00 | 325 2 | 2020-01-01 08:01 | 384 1 | 2020-01-01 08:01 | 175 3 | 2020-01-01 08:01 | 8435 7 | 2020-01-01 08:02 | 784 . . .
I’m trying to get data like this:
SELECT * FROM table WHERE Date-Time = '2020-01-01 08:00' AND Device = '1'
I also need to get data like this:
SELECT * FROM table WHERE Date-Time > '2020-01-01 08:00' Date-Time < '2020-01-10 08:00' AND Device = '1'
But I don’t know what the Date-Time
will be until requested. In this case, I will have to search the entire table for these times. Can I index the start of the day so I know where dates are?
Is there a way to index this table in order to dramatically decrease the queries? Or is there a better way to achieve this?
I have tried indexing the Date-Time
column but I did not decrease the query time at all.
Advertisement
Answer
For this query:
SELECT * FROM mytable WHERE date_time = '2020-01-01 08:00' AND device = 1
You want an index on mytable(date_time, device)
. This matches the columns that come into play in the WHERE
clause, so the database should be able to lookup the matching rows efficiently.
Note that I removed the single quotes around the literal value given to device
: if this is an integer, as it looks like, then it should be treated as such.
The ordering of the column in the index matters; generally, you want the most restrictive column first – from the description of your question, this would probably be date_time
, hence the above suggestion. You might want to try the other way around as well (so: mytable(device, date_time)
).
Another thing to keep in mind from performance perspective: you should probably enumerate the columns you want in the SELECT
clause; if you just want a few additional columns, then it can be useful to add them to the index as well; this gives you a covering index, that the database can use to execute the whole query without even looking back at the data.
Say:
SELECT date_time, device, col1, col2 FROM mytable WHERE date_time = '2020-01-01 08:00' AND device = 1
Then consider:
mytable(date_time, device, col1, col2)
Or:
mytable(device, date_time, col1, col2)