Skip to content
Advertisement

UNION two SELECT queries but result set is smaller than one of them

In a SQL Server statement there is

SELECT id, book, acnt, prod, category from Table1 <where clause...>
UNION
SELECT id, book, acnt, prod, category from Table2 <where clause...>

The first query returned 131,972 lines of data; the 2nd one, 147,692 lines. I didn’t notice there is any commonly shared line of data from these two tables, so I expect the result set after UNION should be the same as the sum of 131972 + 147692 = 279,384.

However the result set after UNION is 133,857. Even though they might have overlapped lines that I accidently missed, the result should be at least the same as the larger result set of those two. I can’t figure how the number 133,857 came from.

Is my understanding about SQL UNION correct? I use SQL server in this case.

Advertisement

Answer

To expand comment given under the question, which I think states what you already know:

UNION takes care of duplicates also within one table as well.

Just take a look at a example:

SETUP:

create table tbl1 (col1 int, col2 int);
insert into tbl1 values
(1,2),
(3,4);
create table tbl2 (col1 int, col2 int);
insert into tbl1 values
(1,2),
(1,2),
(1,2),
(3,4);

Query

select * from tbl1
union 
select * from tbl2;

will produce output

col1 | col2
-----|------
1    | 2
3    | 4

DB fiddle

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement