Skip to content
Advertisement

How do I remove duplicate data in SQL?

I need to display lines that are not repeated, but only with their neighbors, they should remain in the entire database (also in a single copy) in sqlite. Here’s what I’m missing: Initial database:

(id (INTEGER PRIMARY KEY)|security_id|date_and_time|price)
(1|AAPL|2020-08-21 09:59:51|100.2)
(2|AAPL|2020-08-21 09:59:51|100.2)
(3|AAPL|2020-08-21 09:59:51|100.2)
(4|AAPL|2020-08-21 09:59:52|100.2)
(5|AAPL|2020-08-21 09:59:52|100.3)
(6|AAPL|2020-08-21 09:59:52|100.2)
...

What should be output:

(4|AAPL|2020-08-21 09:59:51|100.2)
(5|AAPL|2020-08-21 09:59:52|100.3)
(6|AAPL|2020-08-21 09:59:52|100.2)
...

Thanks 🙂

Advertisement

Answer

Using Lag and Case:

SELECT X.ID, X.SECURITY_ID, X.DATE_AND_TIME, X.PRICE FROM (
SELECT CASE WHEN NEXT_VALUE = 0 THEN 1 
            WHEN NEXT_VALUE = PRICE THEN 0
            ELSE 1 END AS CHECK_VALUE, A.* FROM (
SELECT A.*,LEAD(A.PRICE, 1, 0) OVER(ORDER BY ID) NEXT_VALUE FROM 
TAB A ) A ) X WHERE CHECK_VALUE = 1 ORDER BY ID;
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement