Skip to content
Advertisement

Apache Phoenix SQL Join Limitation when using sub-queries

I have this query in Apache Phoenix SQL:

select WO.* from (
               select "nr_id", "txt_commrcial_label"
                 from "e_application" APP
                where "txt_commrcial_label" in ('a','b')  
                  and "nr_id" not in (select "nr_ap_id"
                                        from "e_workorder"
                                       where "nr_id" in ('888')) 
                  and "epochtimestampchanged" = (select max("epochtimestampchanged")
                                                   from "e_application" 
                                                  where "nr_id" = APP."nr_id") ) as APP2,

       (select Y.ID as WO_ID, Y."nr_id" as WO_nr_id, Y."nr_ap_id" as WO_nr_ap_id
          from ( select "nr_id", max("epochtimestampchanged") as max_epochtimestampchanged
                   from "e_workorder"
                  where CAST(TO_NUMBER("epochtimestampchanged") AS TIMESTAMP) < TO_TIMESTAMP('2020-10-21 19:22:20.0') 
               group by "nr_id" ) as X, "e_workorder" as Y
         where Y."nr_id" = X."nr_id"
           and Y."epochtimestampchanged" < X.max_epochtimestampchanged ) as WO

where APP2."nr_id" = WO.WO_nr_ap_id;

I get java language illegal ... blurb for this not overly complex statement. But I cannot see the reason here or in the manuals.

The individual queries work (imagine the ( and , are not there), but no joy when these 2 sub-queries merged to a JOIN.

Do I need to persist the results to tables and then JOIN? Or is there way around this? I have the impression this is too complex in terms of sub-queries.

Advertisement

Answer

For others to note, this is a big and a different SQL Approach is needed as per below which is a work-around with note from Cloudera:

The best workaround is to explicitly define a join in the APP2 query. See the APP_MAX_TIMESTAMP table joined with the APP table, defining basically the same condition as in the original query (but using a table join instead of an inner select):

The query that should work and should do the same as the original query:

select
   WO.*
from
   (
      select
         "nr_id",
         "txt_commrcial_label"
      from
         "e_application" APP
         LEFT JOIN (
            select
               max("epochtimestampchanged") as max_app_timestamp,
               "nr_id" as max_app_timestamp_nr_id
            from
               "e_application"
            group by "nr_id"
         ) APP_MAX_TIMESTAMP
         ON APP_MAX_TIMESTAMP.max_app_timestamp_nr_id = APP."nr_id"
      where
         "txt_commrcial_label" in
         ( list
         )
         and "nr_id" not in
         (
            select
               "nr_ap_id"
            from
               "e_workorder"
            where
               "nr_id" in
               (
                  '888'
               )
         )
         and "epochtimestampchanged" = max_app_timestamp
   )
   as APP2,
   (
      select
         Y.ID as WO_ID,
         Y."nr_id" as WO_nr_id,
         Y."nr_ap_id" as WO_nr_ap_id
      from
         (
            select
               "nr_id",
               max("epochtimestampchanged") as max_epochtimestampchanged
            from
               "e_workorder"
            where
               CAST(TO_NUMBER("epochtimestampchanged") AS TIMESTAMP) < TO_TIMESTAMP('2022-10-10 19:22:20.0')
            group by
               "nr_id"
         )
         as X,
         "e_workorder" as Y
      where
         Y."nr_id" = X."nr_id"
         and Y."epochtimestampchanged" < X.max_epochtimestampchanged
   )
   as WO
where
   APP2."nr_id" = WO.WO_nr_ap_id;
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement