Skip to content
Advertisement

Writing a Single Query w/ Multiple CTE Subqueries SQL/R

I have some data I would like to pull from a database, I’m using RStudio for my query. What I intend to do is write:

  1. The first CTE statement to pull all my necessary information.

  2. The second CTE statement will add two new columns for two row numbers, which are partitioned by different groups. Two additional columns will be added for Lead and Lag values.

  3. The third CTE will produce two more columns where the two columns use nested case_when statements to give me NewOpen and NewClosed dates.

What I have so far:

q5<- sqlQuery(ch,paste(" 
;with CTE AS
(
    select 
    oz.id as AccountID
    ,ac.PROD_TYPE_CDE as ProductTypeCode
    ,CASE WHEN ac.OPEN_DTE='0001-01-01' then null else ac.OPEN_DTE END as OpenDate
    ,CASE WHEN ac.CLOS_DTE = '0001-01-01' then null else ac.CLOS_DTE END as ClosedDate
    ,df.proc_dte as FullDate

   FROM
   dbs.tb_dbs_acct_fact df
   inner join
   dbs.tb_acct_details ac on df.dw_serv_id = ac.dw_serv_id
   left outer join
   dbs.tb_oz_id oz on df.proc_dte = oz.proc_dte 
),

cte1 as 
(
   select *
  ,row_nbr = row_number() over( partition by AccountID order by AccountID, FullDate asc )
  ,row_nbr2 = row_number() over( partition by AccountID,ProductTypeCode order by AccountID, FullDate asc )
  ,lag(ProductTypeCode) over(partition by AccountID order by FullDate asc ) as Lagging
  ,LEAD(ProductTypeCode) over(partition by AccountID order order by FullDate asc  ) as Leading
  FROM CTE   
),

cte2 as (select *
  ,case when cte1.row_nbr = 1 & cte1.Lagging=cte1.ProductTypeCode then cte1.OpenDate else
   case when cte1.Lagging<>cte1.ProductTypeCode then cte1.FullDate else NULL END END as NewOpen
  ,case when cte1.ClosedDate IS NOT NULL then cte1.ClosedDate else 
   case when cte1.Leading <> cte1.ProductTypeCode then cte1.FullDate else NULL END END as NewClosed
  FROM cte1
);"))

This code, however won’t run.

Advertisement

Answer

As mentioned, WITH is a statement to define CTEs to be used in a final query. Your query only contains CTE definitions but never actually use any in a final statement. Additionally, you can combine the first two CTEs since window functions can run at any level. Possibly the last CTE can serve as your final SELECT statement.

sql <- "WITH CTE AS
         (SELECT 
              oz.id AS AccountID
             , ac.PROD_TYPE_CDE as ProductTypeCode
             , CASE 
                    WHEN ac.OPEN_DTE='0001-01-01' 
                    THEN NULL 
                    ELSE ac.OPEN_DTE 
               END AS OpenDate
             , CASE 
                    WHEN ac.CLOS_DTE = '0001-01-01' 
                    THEN NULL 
                    ELSE ac.CLOS_DTE
               END AS ClosedDate
             , df.proc_dte AS FullDate

             , ROW_NUMBER() OVER (PARTITION BY oz.id 
                                  ORDER BY oz.id, df.proc_dte) AS row_nbr 
             , ROW_NUMBER() OVER (PARTITION BY oz.id, ac.PROD_TYPE_CDE 
                                  ORDER BY oz.id, df.proc_dte) AS row_nbr2
             , LAG(ac.PROD_TYPE_CDE) OVER (PARTITION BY oz.id 
                                           ORDER BY df.proc_dte) AS Lagging
             , LEAD(ac.PROD_TYPE_CDE) OVER (PARTITION BY oz.id
                                            ORDER BY df.proc_dte) AS Leading
          FROM
              dbs.tb_dbs_acct_fact df
          INNER JOIN
              dbs.tb_acct_details ac ON df.dw_serv_id = ac.dw_serv_id
          LEFT OUTER JOIN
              dbs.tb_oz_id oz ON df.proc_dte = oz.proc_dte 
         )

      SELECT *
          , CASE 
                WHEN row_nbr = 1 & Lagging = ProductTypeCode 
                THEN OpenDate 
                ELSE
                    CASE
                        WHEN Lagging <> ProductTypeCode 
                        THEN FullDate 
                        ELSE NULL 
                    END 
            END AS NewOpen
          , CASE 
                WHEN ClosedDate IS NOT NULL 
                THEN ClosedDate 
                ELSE 
                    CASE 
                        WHEN Leading <> ProductTypeCode 
                        THEN FullDate 
                        ELSE NULL 
                    END 
            END AS NewClosed
      FROM CTE;"

q5 <- sqlQuery(ch, sql)
3 People found this is helpful
Advertisement