Let’s say I have this type of dataset:
row Num_A Num_B Num_C Reg_1 Reg_2 ------ ----- ----- ----- ----- ----- 1 9400 9500 9700 AAA CCC 2 9700 9400 9500 BBB DDD 3 9500 9400 9200 EEE AAA 4 9100 9300 9900 AAA DDD
Consider the following conditions:
In row 1, Num_B
"95000"
appears in Num_A
(row 3) and the correspondent Num_A
(row 1) "94000"
is equal to its Num_B
(row 3) (which is the case `”94000″)
So the new column Comm_1
will have the combination of Num_A
and Num_B
as its value. Same goes for Comm_2
As for Comm_3
, the condition are same as Comm_1
or Comm_2
, and an additional condition that Reg_1
(or Reg_2
) of row 1 'AAA'
(or 'CCC'
) is equal to Reg_1
or Reg_2
of row 3 (which is the case).
I want to construct a query so that the result would look like this:
row Num_A Num_B Num_C Reg_1 Reg_2 Comm_1 Comm_2 Comm_3 ------ ----- ----- ----- ----- ----- ----- ----- ----- 1 9400 9500 9700 AAA CCC 94009500 94009700 94009500 2 9700 9400 9500 BBB DDD 97009400 97009500 NULL 3 9500 9400 9200 EEE AAA 95009400 NULL 95009400 4 9100 9300 9900 AAA DDD NULL NULL NULL
I tried the following basic syntax but it isn’t working:
SELECT IF(Num_B in (select distinct Num_A from mydata) , concat(Num_A,Num_B), Null) as Comm_1, IF(Num_C in (select distinct Num_A from mydata) , concat(Num_A,Num_C), Null) as Comm_2 FROM mydata
So please provide any hint and point me to reading about the right point in pl/sql
or what ever procedure I need to learn.
Advertisement
Answer
You can get what (I think) you want with outer self joins:
-- CTE for sample data, including row_num to maintain display order with mydata (row_num, Num_A, Num_B, Num_C, Reg_1, Reg_2) as ( select 1, 94000, 95000, 97000, 'AAA', 'CCC' from dual union all select 2, 97000, 94000, 95000, 'BBB', 'DDD' from dual union all select 3, 95000, 94000, 92000, 'EEE', 'AAA' from dual union all select 4, 91000, 93000, 99000, 'AAA', 'DDD' from dual ) -- actual query select m.*, case when m1.num_a is not null then m.num_a || m1.num_a end as comm_1, case when m2.num_a is not null then m.num_a || m2.num_a end as comm_2, case when m3.num_a is not null then m.num_a || m3.num_a end as comm_3 from mydata m left join mydata m1 on m1.num_a = m.num_b left join mydata m2 on m2.num_a = m.num_c left join mydata m3 on m3.num_a = m.num_b and (m3.reg_1 = m.reg_2 or m3.reg_2 = m.reg_1) order by m.row_num; ROW_NUM NUM_A NUM_B NUM_C REG REG COMM_1 COMM_2 COMM_3 ---------- ---------- ---------- ---------- --- --- ---------- ---------- ---------- 1 94000 95000 97000 AAA CCC 9400095000 9400097000 9400095000 2 97000 94000 95000 BBB DDD 9700094000 9700095000 3 95000 94000 92000 EEE AAA 9500094000 9500094000 4 91000 93000 99000 AAA DDD
(I’ve left in row_num
just to keep the display order the same, it isn’t used otherwise; I assume that isn’t actually a column in your table.)
The comm_1
value is based on a left join to another row which has its num_a
matching this row’s num_b
.
The comm_2
value is based on a left join to another row which has its num_a
matching this row’s num_c
.
The comm_3
value is the same as comm_1
but has to also match its reg_1
with the other row’s reg_2
or vice versa. But, this calculates row 3’s value as 9500094000 rather than the 94009500 shown in the question – not clear which is right though.
This will also get duplicate rows if there is more than one match for any of the outer joins; that isn’t the case with your sample data, but something to watch out for. I have no idea how you would want to handle that if it can happen.
You could also do it with a single outer join and more logic/aggregation in the column expressions:
select m.row_num, m.num_a, m.num_b, m.num_c, m.reg_1, m.reg_2, max(case when m1.num_a = m.num_b then m.num_a || m1.num_a end) as comm_1, max(case when m1.num_a = m.num_c then m.num_a || m1.num_a end) as comm_2, max(case when m1.num_a = m.num_b and (m1.reg_1 = m.reg_2 or m1.reg_2 = m.reg_1) then m.num_a || m1.num_a end) as comm_3 from mydata m left join mydata m1 on (m1.num_a = m.num_b) or (m1.num_a = m.num_c) group by m.row_num, m.num_a, m.num_b, m.num_c, m.reg_1, m.reg_2 order by m.row_num; ROW_NUM NUM_A NUM_B NUM_C REG REG COMM_1 COMM_2 COMM_3 ---------- ---------- ---------- ---------- --- --- ---------- ---------- ---------- 1 94000 95000 97000 AAA CCC 9400095000 9400097000 9400095000 2 97000 94000 95000 BBB DDD 9700094000 9700095000 3 95000 94000 92000 EEE AAA 9500094000 9500094000 4 91000 93000 99000 AAA DDD