I have 2 files with same structure (20 fields) and I must find if there exist equal record. The problem is that this files have more that 40 millions recs.
So I have start a SELECT like:
SELECT * FROM TABLE_A WHERE FIELD1||FIELD2||FIELD3||FIELD4...etc IN (SELECT FIELD1||FIELD2||FIELD3||FIELD4...etc FROM TABLE_B)
I have started more that 24 hours ago.
Can anybody tell me please if there is a more quick solution and how can I improved it please ?
Thanks all in advance
Denis
Advertisement
Answer
I would recommend exists
:
select * from table_a a where exists ( select 1 from table_b b where b.field1 = a.field1 and b.field2 = a.field2 and ... );
The benefits are:
correctness: this checks for exact matches, while the method that concatenates the string may generate fake positives (not to mention conversion problems if some columns are not strings)
performance: the expression in the
from
clause of the subquery is SARGable; if you were to create an index on at least a subset of these 40 columns, the database probably could take advantage of it when executing the query
Note: this does not properly handle null
values; if you want to consider null
values as equal, then you need more complicated expressions:
where (b.field1 = a.field1 or (b.field1 is null and a.field1 is null)) and (b.field2 = a.field2 or (b.field2 is null and a.field2 is null)) and ...