I am joining a small table and a very large table and want to return a distinct item if ANY items match. The table is so large that it takes hours for something that I think should take seconds.
The problem is that I am “iterating” over every single entry in the second table. I want to be able to “break” once a condition is met and return that value instead of continuing over every single account.
In the code below, I am finding every single row for each name
that I am joining, even though I am only returning the DISTINCT example.name
and don’t care about every row. How can I return DISTINCT.name
after finding the first instance of new_ex.data = ...
after performing the INNER JOIN
?
SELECT DISTINCT example.name FROM ( SELECT DISTINCT ex.user AS name FROM exampleTable ex WHERE ex.timestamp >= '2022-01-01' AND ex.group = 'test' AND new_ex.data = '123' ) AS example_users INNER JOIN exampleTable new_ex on example_users.name = new_ex.user AND new_ex.timestamp >= '2022-01-01' AND ( OR new_ex.data = 'abc' OR new_ex.data = 'def' OR new_ex.data = 'ghi' -- ~10 more of these OR statements )
Advertisement
Answer
Without seeing the data it’s hard to be sure this can’t be simplified further, but I think you can at least boil this down to
select distinct ex.user as name from exampleTable ex where ex.timestamp >= '2022-01-01' and ex.group = 'test' AND new_ex.data = '123' and exists ( select 1 from exampleTable new_ex where new_ex.user=ex.name and new_ex.data = '123' and new_ex.timestamp >= '2022-01-01' and new_ex.data in ('abc','def','ghi'...) )