DB2 – find same rec between 2 big files

Question

I have 2 files with same structure (20 fields) and I must find if there exist equal record. The problem is that this files have more that 40 millions recs. So I have start a SELECT like: SELECT * FROM ...

Accepted Answer

I would recommend exists:select *from table_a awhere exists (    select 1    from table_b b    where b.field1 = a.field1 and b.field2 = a.field2 and ...);The benefits are:correctness: this checks for exact matches, while the method that concatenates the string may generate fake positives (not to mention conversion problems if some columns are not strings)performance: the expression in the from clause of the subquery is SARGable; if you were to create an index on at least a subset of these 40 columns, the database probably could take advantage of it when executing the queryNote: this does not properly handle null values; if you want to consider null values as equal, then you need more complicated expressions:where     (b.field1 = a.field1 or (b.field1 is null and a.field1 is null))    and (b.field2 = a.field2 or (b.field2 is null and a.field2 is null))    and ...

Advertisement

Answer