Skip to content
Advertisement

Find missing entries by date

I have two redshift tables:

  • alert_type: where i keep types of alerts in my system
  • alert: where i keep alerts

Every day I generate a new alert for each alert type. If something has failed in application side I will be missing an entry in alert table. So I’m trying to write a query to get alert_type‘s that are missing and date for which they are missing.

(SELECT
        CAST (p0.n + p1.n*2 + p2.n * POWER(2,2) + p3.n * POWER(2,3)+ p4.n * POWER(2,4)+ p5.n * POWER(2,5) AS INT) AS days
          FROM
            (SELECT 0 as n UNION SELECT 1) p0,
            (SELECT 0 as n UNION SELECT 1) p1,
            (SELECT 0 as n UNION SELECT 1) p2,
            (SELECT 0 as n UNION SELECT 1) p3,
            (SELECT 0 as n UNION SELECT 1) p4,
            (SELECT 0 as n UNION SELECT 1) p5
    WHERE (p0.n + p1.n*2 + p2.n * POWER(2,2) + p3.n * POWER(2,3)+ p4.n * POWER(2,4)+ p5.n * POWER(2,5)) <= 31
    )

Since there is an issue with generate_series running only on leader node I’m using this query to generate last 30 dates.

SELECT DATE(CURRENT_DATE - CAST ( days AS INT )) AS dt
    FROM
     (SELECT
        CAST (p0.n + p1.n*2 + p2.n * POWER(2,2) + p3.n * POWER(2,3)+ p4.n * POWER(2,4)+ p5.n * POWER(2,5) AS INT) AS days
          FROM
            (SELECT 0 as n UNION SELECT 1) p0,
            (SELECT 0 as n UNION SELECT 1) p1,
            (SELECT 0 as n UNION SELECT 1) p2,
            (SELECT 0 as n UNION SELECT 1) p3,
            (SELECT 0 as n UNION SELECT 1) p4,
            (SELECT 0 as n UNION SELECT 1) p5
    WHERE (p0.n + p1.n*2 + p2.n * POWER(2,2) + p3.n * POWER(2,3)+ p4.n * POWER(2,4)+ p5.n * POWER(2,5)) <= 31
    ) WHERE dt NOT IN (SELECT DATE(created_at) FROM alert);

The following query returns all dates that are missing but I don’t get the alerts.alert_type_id property from this.

I’m not sure how to turn this query to be able to go from alert_type to alert to basically get: alert_type.name | date_missing

Advertisement

Answer

Generate the dates and alerts and then use left join or not exists to filter out the ones with no match

with pair as (
      SELECT 0 as n UNION ALL SELECT 1
     ),
     dates as (
      SELECT DATE(CURRENT_DATE - ROW_NUMBER() OVER ()) AS dt
      FROM pair p0 CROSS JOIN 
           pair p1 CROSS JOIN 
           pair p2 CROSS JOIN 
           pair p3 CROSS JOIN 
           pair p4 CROSS JOIN 
           pair p5 
      LIMIT 31
     )
SELECT d.dt, alt.*
FROM dates d CROSS JOIN
     alert_type alt LEFT JOIN
     alerts a
     ON a.alert_type_id = alt.alert_type_id AND
        a.created_at::date = d.dt
WHERE a.alert_type_id IS NULL;

I simplified a bit the part about generating the dates.

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement