Skip to content
Advertisement

max DISTINCT returns multiple rows

I am working on an sql script which is executed by a .bat daily and outputs a list of IDs, the date of access, and their level. While it returns what I want, mostly, I noticed that some of the outputted rows are duplicates. Could someone please help me modify my script so that it outputs only one date (the latest) for each ID? Thank you very much.

SELECT T.ID
 + ';' + substring(convert(char, convert(date , T.QDATE ) ), 1, 10)
 + ';' + A.[LEVEL]
FROM 

    (SELECT CID AS 'ID',
        MAX (DISTINCT EDATE) QDATE
        FROM [XXXXXXXXXXXXXXXXXXXXXXXX].[XXX].[XXXXXXXXXXXXXXX]
        GROUP BY CID
    ) T , 
    [XXXXXXXXXXXXXXXXXXXXXXXX].[XXX].[XXXXXXXXXXXXXXX] A

WHERE
    T.ID = A.CID
AND T.QDATE = A.EDATE

ORDER BY A.[CID]

EDIT: I’ve added a bit of sample data from table A

| QID | CID | LEVEL |    EDATE   | OP | STATUS |
|-----|-----|-------|------------|----|--------|
|  1  |00001|  LOW  | 2021-07-16 | 01 | CLOSED |
|  2  |00001|  LOW  | 2021-07-16 | 01 | CLOSED |
|  3  |00002| MEDIUM| 2021-07-16 | 01 | CLOSED |
|  4  |00003|  LOW  | 2021-07-16 | 01 | CLOSED |

In this bit of data, my output contains both rows for CID 00001. Looking for a way to delete the duplicate rows from the output and not make any modifications to the db itself.

Advertisement

Answer

Your data is showing only a date portion context of your EDate field. Is is really a date or date/time. It would suggest date/time due to your call to CONVERT( Date, T.QDate) in the query. Your sample data SHOULD show context of time, such as to the second. I would not suspect there are multiple records with the same time-stamp to the second, but its your data.

The DISTINCT should not be at the inner query, but the OUTER query, but IF you have multiple entries for the same CID AT the exact same time AND there are multiple values for Leve, OP, and Status, then you will get multiple.

However, if the values are the same across-the-board as in your sample data, you SHOULD be good with

SELECT DISTINCT 
      T.ID + ';' 
         + substring(convert(char, convert(date , T.QDATE ) ), 1, 10)
         + ';' + A.[LEVEL]
   FROM 
      ( SELECT  
              CID AS 'ID',
              MAX (EDATE) QDATE
           FROM 
              [XXXXXXXXXXXXXXXXXXXXXXXX].[XXX].[XXXXXXXXXXXXXXX]
           GROUP BY 
              CID ) T
       JOIN [XXXXXXXXXXXXXXXXXXXXXXXX].[XXX].[XXXXXXXXXXXXXXX] A
          ON T.ID = A.CID
         AND T.QDATE = A.EDATE
   ORDER BY 
      A.CID

The distinct keyword in this context means only give me 1 unique record per each combination of all columns. So in your sample data, you would only have 1 record result for the CID = ‘00001’.

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement