Skip to content
Advertisement

temporal joins in hierarchical query

I want to join various nodes of a tree, making sure the the returned root-to-leaf path is temporally valid. The tricky part is that the data source is dated with validity from-to dates.

ID NVALUE VFROM VTO
1 A 2021-01-01 2021-01-31
1 B 2021-02-01 2021-02-28
2 C 2021-01-01 2021-02-28
3 D 2021-01-01 2021-01-31
3 E 2021-02-01 2021-02-28

the links are trivially pointing to the node ids (but not their dates!)

LINK_CHILD LINK_PARENT
1 2
2 3

from this I would like to return the valid paths and their validity dates:

  1. A-C-D valid from 2021-01-01 to 2021-01-31
  2. B-C-E valid from 2021-02-01 to 2021-02-28

invalid paths (e.g. A-C-E should not be returned, since there is no moment in time in which all the three nodes are valid).

The issue I have with this is that the “overlap” check is not transitive (so A overlaps with B and B overlaps with C does not imply that A overlaps with C). So when writing the connect by query each level overlaps with the next, but the resulting global path is invalid.

the basic query set up I have is

Advertisement

Answer

I believe this is the most efficient way. One recursive with query and another trivial query to get the leaves.

here’s an example with a more complex data source:

dbfiddle

I have now tested this approach against our data which have 300K nodes and 240K links, and the trees (plus some additional pivoting) are parsed in 6 seconds. Similar work was done in 10 minutes by the ETLs.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement