Skip to content
Advertisement

SQL Select query optimization with indexing

A posts table contains 1 million rows. This table has a field with the name poster_id.

I have a list of followers by this poster_id.

I am trying to get a list of all activities from this followers (35 in this case but less or more is possible) in the last 48 hours.

I use this query:

SELECT post_id
     , topic_id
     , poster_id
     , post_time 
  FROM posts 
 WHERE post_time > 1606833542 
   AND poster_id IN (80202, 74247, 79290, 72488, 
                    111751, 85040, 100256, 68025, 
                    101088, 101598, 101950, 103252, 
                    103071, 80063, 100372, 102530, 109961, 
                    109854, 105626, 108967, 110391, 104423, 
                    113243, 111673, 113979, 104670, 127318, 
                    68252, 109606, 121393, 122991, 124489, 
                    127723, 126525) 
 ORDER 
    by post_time 
 LIMIT 100

Problem:

This query takes too long (0.4000 seconds) to execute.

The poster_id has an index of the post table.

How can I make this query faster?

Advertisement

Answer

try avoid the IN clause and use a join

    SELECT 
          p.post_id
        , p.topic_id
        , p.poster_id
        , p.post_time 
    FROM posts p
    INNER JOIN (
              SELECT 80202 poster_id
        UNION SELECT 74247
        UNION SELECT 79290 
        UNION SELECT  72488 
        UNION SELECT  111751 
        UNION SELECT  85040 
        UNION SELECT  100256 
        UNION SELECT  68025 
        UNION SELECT  101088 
        UNION SELECT  101598 
        UNION SELECT  101950 
        UNION SELECT  103252 
        UNION SELECT  103071 
        UNION SELECT  80063 
        UNION SELECT  100372 
        UNION SELECT  102530 
        UNION SELECT  109961 
        UNION SELECT  109854 
        UNION SELECT  105626 
        UNION SELECT  108967 
        UNION SELECT  110391 
        UNION SELECT  104423 
        UNION SELECT  113243 
        UNION SELECT  111673 
        UNION SELECT  113979 
        UNION SELECT  104670 
        UNION SELECT  127318 
        UNION SELECT  68252 
        UNION SELECT  109606 
        UNION SELECT  121393 
        UNION SELECT  122991 
        UNION SELECT  124489 
        UNION SELECT  127723 
        UNION SELECT  126525
    ) t ON t.poster_id = p.poster_id
        AND p.post_time > 1606833542 
    ORDER by p.post_time LIMIT 100

could be the value in the IN clause are form some subquery in this case ypou could use the related subquery instead of the UNION …..

WHERE IN clase is the same as serverl OR condition (several internal subquery) .. instead INNER JOIN just check the values in a single operation

for better performance, instead of you actual index on poster_id, you could try using a redundant index adding all the columns you select in your index eg:

create index my_index on posts (poster_id, post_id, topic_id , post_time )

in this way all the query value are obtained using the index and the query don’t need to access at the table ..

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement