Skip to content
Advertisement

SQL group by is bringing lots of duplicates

I’m trying to group my data by OccDate, ParkingContextType, AuthorityId and of course, i would like to measure each sessions for each date/contexttype etc… However, I`m also trying to calculate the sum of sessions between a specific hour 8-18 for each date,ParkingContextType & AuthorityId.

The problem here is that 1 specific date is duplicated a lot and also the authority id etc… Something I’m doing wrong here and I cannot spot the problem,

Any one who can help me please?

Here is the SQL

WITH ParkeonCTE
AS
(
SELECT 
    OccDate = CONVERT(DATE, OC.LocalStartTime),
    TotalOccSessions = COUNT(OC.SessionId),
    AuthorityId,
    TotalOccDuration = ISNULL(SUM(OC.DurationMinutes),0),
    TotalNumberOfOverstay = SUM(CAST(OC.IsOverstay AS INT)),
    TotalMinOfOverstays = ISNULL(SUM(OC.OverStayDurationMinutes),0),
    (CASE
        WHEN OC.OspId IS NULL THEN 'OffStreet' ELSE 'OnStreet'
        END
    ) AS ParkingContextType,
    SUM(CASE
        WHEN CAST(OC.LocalStartTime AS TIME) >= '08:00:00' 
        AND CAST(OC.LocalStartTime AS TIME) <= '18:00:00'
            THEN 1
            ELSE 0
     END
    ) AS TotalRestrictedSessions
FROM Analytics.OccupancySessions AS OC
WHERE OC.AuthorityId IS NOT NULL
GROUP BY  CONVERT(DATE,OC.LocalStartTime), OC.AuthorityId,OC.OspId
 )
  SELECT OC.OccDate,
   OC.ParkingContextType,
   OC.AuthorityId,
   OC.TotalRestrictedSessions,
   SUM(OC.TotalOccSessions) AS TotalOccSessions,
   AVG(OC.TotalOccDuration) AS AvgOccMinutesDuration, -- wrong
   SUM(OC.TotalOccDuration) AS TotalOccDuration,
   SUM(OC.TotalNumberOfOverstay) AS TotalNumberOfOverstay,
   SUM(OC.TotalMinOfOverstays) AS TotalMinOfOverstays,
   CAST(AVG(OC.TotalMinOfOverstays) AS decimal(10,2)) AS AvgMinOfOverstays -- wrong
  FROM ParkeonCTE AS OC
  GROUP BY OC.OccDate, OC.AuthorityId, OC.ParkingContextType,OC.TotalRestrictedSessions
  ORDER BY OC.OccDate DESC

This is the end result, which is wrong…I should have less OccDate, with Two parkingcontexttype OnStreet/OffStreet for each authority id… But apparently I have loads of them.

enter image description here

And a minimal result it should be:

OccDate    ParkingContextType    AuthorityId .. (All of the calculated columns)
2019/11/18 OnStreet              32432-3424-32423
2019/11/18 OffStreet             32432-3424-32423

Advertisement

Answer

You should use Aggregated function

Change SUM(OC.TotalRestrictedSessions) in Select and Remove TotalRestrictedSessions From Group By

WITH ParkeonCTE
AS
(
 SELECT 
    OccDate = CONVERT(DATE, OC.LocalStartTime),
    TotalOccSessions = COUNT(OC.SessionId),
    AuthorityId,
    TotalOccDuration = ISNULL(SUM(OC.DurationMinutes),0),
    TotalNumberOfOverstay = SUM(CAST(OC.IsOverstay AS INT)),
    TotalMinOfOverstays = ISNULL(SUM(OC.OverStayDurationMinutes),0),
    (CASE
        WHEN OC.OspId IS NULL THEN 'OffStreet' ELSE 'OnStreet'
        END
    ) AS ParkingContextType,
    SUM(CASE
        WHEN CAST(OC.LocalStartTime AS TIME) >= '08:00:00' 
        AND CAST(OC.LocalStartTime AS TIME) <= '18:00:00'
            THEN 1
            ELSE 0
     END
    ) AS TotalRestrictedSessions
 FROM Analytics.OccupancySessions AS OC
 WHERE OC.AuthorityId IS NOT NULL
 GROUP BY  CONVERT(DATE,OC.LocalStartTime), OC.AuthorityId,OC.OspId
)
SELECT OC.OccDate,
   OC.ParkingContextType,
   OC.AuthorityId,
   SUM(OC.TotalRestrictedSessions),
   SUM(OC.TotalOccSessions) AS TotalOccSessions,
   AVG(OC.TotalOccDuration) AS AvgOccMinutesDuration, 
   SUM(OC.TotalOccDuration) AS TotalOccDuration,
   SUM(OC.TotalNumberOfOverstay) AS TotalNumberOfOverstay,
   SUM(OC.TotalMinOfOverstays) AS TotalMinOfOverstays,
   CAST(AVG(OC.TotalMinOfOverstays) AS decimal(10,2)) AS AvgMinOfOverstays
  FROM ParkeonCTE AS OC
  GROUP BY OC.OccDate, OC.AuthorityId, OC.ParkingContextType
  ORDER BY OC.OccDate DESC
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement