I am joining a small table and a very large table and want to return a distinct item if ANY items match. The table is so large that it takes hours for something that I think should take seconds.
The problem is that I am “iterating” over every single entry in the second table. I want to be able to “break” once a condition is met and return that value instead of continuing over every single account.
In the code below, I am finding every single row for each name
that I am joining, even though I am only returning the DISTINCT example.name
and don’t care about every row. How can I return DISTINCT.name
after finding the first instance of new_ex.data = ...
after performing the INNER JOIN
?
SELECT DISTINCT example.name
FROM (
SELECT DISTINCT ex.user AS name
FROM exampleTable ex
WHERE ex.timestamp >= '2022-01-01'
AND ex.group = 'test'
AND new_ex.data = '123'
) AS example_users
INNER JOIN exampleTable new_ex on example_users.name = new_ex.user
AND new_ex.timestamp >= '2022-01-01'
AND (
OR new_ex.data = 'abc'
OR new_ex.data = 'def'
OR new_ex.data = 'ghi'
-- ~10 more of these OR statements
)
Advertisement
Answer
Without seeing the data it’s hard to be sure this can’t be simplified further, but I think you can at least boil this down to
select distinct ex.user as name
from exampleTable ex
where ex.timestamp >= '2022-01-01'
and ex.group = 'test'
AND new_ex.data = '123'
and exists (
select 1
from exampleTable new_ex
where new_ex.user=ex.name
and new_ex.data = '123'
and new_ex.timestamp >= '2022-01-01'
and new_ex.data in ('abc','def','ghi' )
)