Skip to content
Advertisement

SQL join condition either A or B but not both A and B

I have sales data by years and quarters and for the last year I want to fill missing quarters with last available value.

Say we have source table:

+------+---------+-------+--------+
| year | quarter | sales | row_no |
+------+---------+-------+--------+
| 2018 |       1 |  4000 |      5 |
| 2018 |       2 |  6000 |      4 |
| 2018 |       3 |  5000 |      3 |
| 2018 |       4 |  3000 |      2 |
| 2019 |       1 |  8000 |      1 |
+------+---------+-------+--------+

Desired results:

+------+---------+-------+------------------------+
| year | quarter | sales |                        |
+------+---------+-------+------------------------+
| 2018 |       1 |  4000 |                        |
| 2018 |       2 |  6000 |                        |
| 2018 |       3 |  5000 |                        |
| 2018 |       4 |  3000 |                        |
| 2019 |       1 |  8000 |                        |
| 2019 |       2 |  8000 | <repeat the last value |
| 2019 |       3 |  8000 | <repeat the last value |
| 2019 |       4 |  8000 | <repeat the last value |
+------+---------+-------+------------------------+

So the task is to make Cartesian of year and quarter and left join to it the sales either corresponding or the last.

This code gets me almost there:

select r.year, k.quarter, t.sales
from (select distinct year        from [MyTable]) r cross join
     (select distinct quarter     from [MyTable]) k left join
     [MyTable] t
     on (r.year = t.year and k.quarter=t.quarter) or row_no=1

How to correct the last line (condition of join) so thet the 2018 is not doubled?

Advertisement

Answer

One method uses outer apply:

select y.year, q.quarter, t.sales
from (select distinct year from [MyTable]) y cross join
     (select distinct quarter from [MyTable]) q outer apply
     (select top (1) t.*
      from [MyTable] t
      where t.year < y.year or
            (t.year = y.year and t.quarter <= q.quarter)
      order by t.year desc, t.quarter desc
     ) t;

For your volume of data, this should be fine.

A more efficient method — assuming you are only assigning values to the end — would be:

select y.year, q.quarter,
       coalesce(t.sales, tdefault.sales)
from (select distinct year from [MyTable]) y cross join
     (select distinct quarter from [MyTable]) q left join
     [MyTable] t
     on t.year = y.year and
        t.quarter = q.quarter cross join
     (select top (1) t.*
      from [MyTable] t
      order by t.year desc, t.quarter desc
     ) tdefault
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement