I have an Activity table in which user’s activity on actions is stored. There is two type of actions in column type that are required to check:
- add_favorite
- remove_favorite
So same object can be added or removed from favorites multiple times so table has records like:
| id | type | timestamp | object_id | user | |----|-----------------|------------|-----------|-------| | 1 | add_favorite | 1584150783 | 4 | user1 | | 2 | add_favorite | 1584151341 | 5 | user1 | | 3 | remove_favorite | 1584161259 | 4 | user1 | | 4 | add_favorite | 1584168758 | 4 | user1 | | 5 | remove_favorite | 1584171635 | 10 | user1 | | 6 | add_favorite | 1584174201 | 11 | user1 | | 7 | remove_favorite | 1584177194 | 5 | user1 |
First I was constructing two queries that was collecting all object_id’s with add_favorite actions for the user in array. Then I was filtering this array with another one set of object_id’s which had type of action removed_favorite
SELECT object_id FROM activity WHERE user='user1' AND type='add_favorite'; SELECT object_id FROM activity WHERE user='user1' AND type='remove_favorite';
But this is highly ineffective on large number of activities. Please advise me query with LEFT JOIN which would be more effective and will give the end result
Advertisement
Answer
You can simplify your query to:
SELECT object_id FROM activity WHERE user = 'user1' AND type IN ('add_favorite', 'remove_favorite');
Then I would recommend aggregation:
SELECT object_id FROM activity WHERE user = 'user1' AND type IN ('add_favorite', 'remove_favorite') GROUP BY object_id HAVING SUM(type = 'add_favorite') > SUM(type = 'remove_favorite');
This gets the objects that have net more adds than removes.
An alternative is to look at the last action and only choose those that are adds:
SELECT a.object_id FROM activity a WHERE a.user = 'user1' AND a.timestamp = (SELECT MAX(a2.timestamp) FROM activity a2 WHERE a2.user = a.user AND a2.type IN ('add_favorite', 'remove_favorite') ) AND a.type = 'add_favorite'
With an index on activity(user, type, timestamp)
, this probably has better performance.