I am trying to optimize my query which uses GROUP BY. But can’t it takes too long time.
I have a virtual table which contains 3 columns:
CREATE VIRTUAL TABLE GoiWordsWithSeq USING FTS5(idseq,sid,body);
I want the group to be just as the query below, but it just takes a bit too long to run. For example I tried this first which works pretty good but still I wonder if I can make it faster:
select GROUP_CONCAT(body), sid from GoiWordsWithSeq GROUP BY sid HAVING sid IN ( select sid from GoiWordsWithSeq where GoiWordsWithSeq.body match 'awesome' )
Then:
select GROUP_CONCAT(body), GoiWordsWithSeq.sid from GoiWordsWithSeq inner join ( select sid from GoiWordsWithSeq where GoiWordsWithSeq.body match 'awesome' ) GoiMatch ON GoiMatch.sid = GoiWordsWithSeq.sid GROUP BY GoiWordsWithSeq.sid
But it still takes long time to query. I’m trying to use JOIN because I heard it is usually faster but can’t really succed. Any advice would help!
Advertisement
Answer
Why did you choose to set the condition in the HAVING
clause?
With your code the condition is checked after all the aggregations are finished.
Your condition does not involve any aggregated column so there is no reason to use it in the HAVING
clause.
It would make more sense to set it in a WHERE
clause so the dataset is filtered and then GROUP BY
is applied for aggregation:
SELECT GROUP_CONCAT(body), sid FROM GoiWordsWithSeq WHERE sid IN ( SELECT sid FROM GoiWordsWithSeq WHERE body MATCH 'awesome' ) GROUP BY sid
You could also try EXISTS
which sometimes performs better:
SELECT GROUP_CONCAT(g.body), g.sid FROM GoiWordsWithSeq g WHERE EXISTS ( SELECT 1 FROM GoiWordsWithSeq WHERE sid = g.sid AND body MATCH 'awesome' ) GROUP BY g.sid