Let say that I have this table “contract” which have duplicated records in the “END” column for the same ID.
ID | Begin | End |
---|---|---|
20 | 2016-01-01 | 9999-12-31 |
20 | 2020-01-01 | 9999-12-31 |
30 | 2018-01-01 | 2019-02-28 |
30 | 2019-03-01 | 9999-12-31 |
30 | 2020-02-01 | 9999-12-31 |
10 | 2019-01-01 | 2019-06-30 |
10 | 2019-07-01 | 2020-02-29 |
10 | 2020-03-01 | 9999-12-31 |
I want to get the oldest date in the “Begin” column for all the ID’s that have duplicated records in the “END” column with the date “9999-12-31”. So for this example I expect to get:
ID | Begin |
---|---|
20 | 2016-01-01 |
30 | 2019-03-01 |
I made an SQL script, but there should be a better way.
x
select ID, MIN(Begin) from
(
select * from contract m where exists
(
select 1 from contract v where END = '9999-12-31' and v.ID = m.ID
having count(ID)=2
)
and END = '9999-12-31'
)a
group by FUN_ID
Advertisement
Answer
If it is a big table, you really want to use EXISTS for finding duplicates because it will short circuit. Here’s two ways to use EXISTS that might help with what you are trying to do.
DROP TABLE IF EXISTS #Test;
CREATE TABLE #Test
(
ID INT NOT NULL
,[Begin] DATE NOT NULL
,[End] DATE NOT NULL
)
;
INSERT INTO #Test
VALUES
(20,'2016-01-01','9999-12-31')
,(20,'2020-01-01','9999-12-31')
,(30,'2018-01-01','2019-02-28')
,(30,'2019-03-01','9999-12-31')
,(30,'2020-02-01','9999-12-31')
,(10,'2019-01-01','2019-06-30')
,(10,'2019-07-01','2020-02-29')
,(10,'2020-03-01','9999-12-31')
;
--See all duplicates with OldestBegin for context
SELECT
TST.ID
,TST.[Begin]
,TST.[End]
,OldestBegin = MIN([Begin]) OVER (PARTITION BY TST.ID,TST.[End])
FROM #Test AS TST
WHERE EXISTS
(
SELECT 1
FROM #Test AS _TST
WHERE TST.ID = _TST.ID
AND TST.[End] = _TST.[End]
AND TST.[Begin] <> _TST.[Begin]
)
;
--Get only oldest duplicate
SELECT
TST.ID
,TST.[End]
,[Begin] = MIN([Begin])
FROM #Test AS TST
WHERE EXISTS
(
SELECT 1
FROM #Test AS _TST
WHERE TST.ID = _TST.ID
AND TST.[End] = _TST.[End]
AND TST.[Begin] <> _TST.[Begin]
)
GROUP BY
TST.ID
,TST.[End]
;