Skip to content
Advertisement

Hierarchy Queries in BigQuery

I have a data from ControlM (scheduler) this way

PREDECESSOR_JOB_ID,SUCCESSOR_JOB_ID
XYZ,ABC
ABC,LMN
ABC,PQR
XYZ,EFG
EFG,STU
STU,TUV
STU,VWX
.
.
.

I saw many solutions where they have started from first node and descended to last node. But that is likely to fan-out with this data. What I am looking for a fan-in solution, where I can start with any end node and come towards first node

In this case the answer that am looking for is

ABC-->XYZ
LMN-->ABC-->XYZ
PQR-->ABC-->XYZ
EFG-->XYZ
STU-->EFG-->XYZ
TUV-->STU-->EFG-->XYZ
VWX-->STU-->EFG-->XYZ

Advertisement

Answer

Consider below approach

with recursive iterations as (
  select successor_job_id, predecessor_job_id, 1 pos from your_table
  union all 
  select b.successor_job_id, a.predecessor_job_id, pos + 1 
  from your_table a join iterations b
  on b.predecessor_job_id = a.successor_job_id 
)
select successor_job_id || '-->' || string_agg(predecessor_job_id, '-->' order by pos) as jobs_path
from iterations
where not successor_job_id is null
group by successor_job_id             

if applied to sample data in your question – output is

enter image description here

Advertisement