I have some issues in getting my Vertica query to work correctly. Let’s assume I have a relation which is defined as follows:
CREATE TABLE KOMM ( MANDT VARCHAR(3), DOCNUM VARCHAR(16), COUNTER VARCHAR(3), NUM VARCHAR(6), NAM VARCHAR(30), INNUM VARCHAR(6), KOMMLEVEL VARCHAR(2), MSG VARCHAR(1000), NUM_UNH VARCHAR(6) );
and insert some example values:
insert into KOMM values ('200','45320824','000','000003','START','000002','02','START OF MESSAGE'); insert into KOMM values ('200','45320824','000','000004','INTERMED','000003','03','EXAMPLEEXAMPLEEXAMPLE'); insert into KOMM values ('200','45320824','000','000005','ADV_01','000003','03','TESTADV1'); insert into KOMM values ('200','45320824','000','000011','END','000010','04','01234567'); ... insert into KOMM values ('200','45320824','000','000022','START','000002','02','CONTINUE START OF MESSAGE'); insert into KOMM values ('200','45320824','000','000023','INTERMED','000003','03','SECONDEXAMPLEEXAMPLEEXAMPLE'); insert into KOMM values ('200','45320824','000','000024','ADV_01','000003','03','SECONDTESTADV1'); insert into KOMM values ('200','45320824','000','000030','END','000010','04','01234567');
Now, I would like to query the relation with the following query:
UPDATE KOMM E SET NUM_UNH = (SELECT MAX(X.NUM) FROM KOMM X WHERE X.NAM IN ('START') AND X.MANDT = E.MANDT AND X.DOCNUM = E.DOCNUM AND X.NUM <= E.NUM ) FROM KOMM X WHERE E.MANDT = X.MANDT AND E.DOCNUM = X.DOCNUM ;
This query, however, throws the following error:
Execution error: [Vertica]VJDBC ERROR: Non-equality correlated subquery expression is not supported
I think this is because Vertica does not allow <=, >=, < and > comparisons within subqueries? See Vertica Documentation for Subquery Restrictions
So I tried to solve it by using BETWEEN:
UPDATE KOMM E SET NUM_UNH = (SELECT max(X.NUM) FROM KOMM X WHERE X.NAM IN ('START') AND X.MANDT = E.MANDT AND X.DOCNUM = E.DOCNUM AND X.NUM BETWEEN '000000' AND (E.NUM) ) from KOMM X where E.MANDT = X.MANDT and E.DOCNUM = X.DOCNUM ;
This results in the same error:
Execution error: [Vertica]VJDBC ERROR: Non-equality correlated subquery expression is not supported
So I tried to ignore the condition and ran into another issue after executing the following query:
UPDATE KOMM E SET NUM_UNH = (SELECT max(X.NUM) FROM KOMM X WHERE X.NAM IN ('START') AND X.MANDT = E.MANDT AND X.DOCNUM = E.DOCNUM ) from KOMM X where E.MANDT = X.MANDT and E.DOCNUM = X.DOCNUM ;
Which resulted in the following error:
Execution error: [Vertica]VJDBC ERROR: Self joins in UPDATE statements are not allowed [Vertica][VJDBC]Detail: Target relation “da592a51-45ee-4d3e-9983-e8a3e56fd852_2fd1ec98-bb71-4ad0-8d33-d751e209dcdd”.KOMM also appears in the FROM list
I found a “workaround” for this issue by replacing “from KOMM X” with “from (select * from KOMM) X”. This query does execute, however not as wanted (as you can imagine). The goal is to update the table with the NUM value, until the next higher NUM value appears in the table such that the table can finally be aggregated by only displaying the rows in which the NAM is ‘START’:
SELECT M.MANDT, M.DOCNUM, M.NUM_UNH, max(case when M.NAM = 'START' then substring(cast(M.MSG as varchar(99)),15,6) end) as UNH_SEG, max(case when M.NAM = 'END' then substring(cast(M.MSG as varchar(36)),4,33) end) as PMSG from KOMM M group by M.MANDT, M.DOCNUM, NUM_UNH ;
First row of result Second row of result
Unfortuantely, I am not able to find a solution for these issues, which is why I hope that you guys can help me out. Thank you in advance for your help and suggestions!
Best regards, MoDo
Advertisement
Answer
I understand your question so that you want a report of two rows in your example, containing the NUM value of a row with nam
of ‘START’ and the last 5 characters of the message of the following row with nam
of ‘END’. The only thing I could not figure out is where in the world you got the input for the string ‘GE’ in your new column unh_seg
…
I would try to use a completely different approach. It involves GROUP-ing, OLAP functions, and nested queries.
I see that we have two groups of rows (if we order by num
) that have a sequence of one ‘START’ , one or more of something else, and one ‘END’ in the nam
column.
We need an additional column to distinguish the two groups. And Vertica has a quite unique OLAP function CONDITIONAL_TRUE_EVENT()
that comes in pretty handy here. It starts with 0 and increments by 1 every time the Boolean expression in the parentheses is true.
Let’s create and fill your table with the data types as I’d use them:
DROP TABLE IF EXISTS komm ; -- note that I use numbers, especially integers, wherever I can CREATE TABLE KOMM ( mandt INT, docnum INT, counter INT, num INT, nam VARCHAR(30), innum INT, kommlevel INT, msg VARCHAR(64) ); INSERT INTO KOMM SELECT 200,45320824,000,000003,'START', 2,2,'START OF MESSAGE' UNION ALL SELECT 200,45320824,000,000004,'INTERMED', 3,3,'EXAMPLEEXAMPLEEXAMPLE' UNION ALL SELECT 200,45320824,000,000005,'ADV_01', 3,3,'TESTADV1' UNION ALL SELECT 200,45320824,000,000011,'END', 0,4,'01234567' UNION ALL SELECT 200,45320824,000,000022,'START', 2,2,'CONTINUE START OF MESSAGE' UNION ALL SELECT 200,45320824,000,000023,'INTERMED', 3,3,'SECONDEXAMPLEEXAMPLEEXAMPLE' UNION ALL SELECT 200,45320824,000,000024,'ADV_01', 3,3,'SECONDTESTADV1' UNION ALL SELECT 200,45320824,000,000030,'END', 10,4,'01234567' ; COMMIT; SELECT * FROM komm; -- out mandt | docnum | counter | num | nam | innum | kommlevel | msg -- out -------+----------+---------+-----+----------+-------+-----------+----------------------------- -- out 200 | 45320824 | 0 | 3 | START | 2 | 2 | START OF MESSAGE -- out 200 | 45320824 | 0 | 4 | INTERMED | 3 | 3 | EXAMPLEEXAMPLEEXAMPLE -- out 200 | 45320824 | 0 | 5 | ADV_01 | 3 | 3 | TESTADV1 -- out 200 | 45320824 | 0 | 11 | END | 0 | 4 | 01234567 -- out 200 | 45320824 | 0 | 22 | START | 2 | 2 | CONTINUE START OF MESSAGE -- out 200 | 45320824 | 0 | 23 | INTERMED | 3 | 3 | SECONDEXAMPLEEXAMPLEEXAMPLE -- out 200 | 45320824 | 0 | 24 | ADV_01 | 3 | 3 | SECONDTESTADV1 -- out 200 | 45320824 | 0 | 30 | END | 10 | 4 | 01234567 -- out (8 rows)
With the table built like this, I run the query below.
WITH -- need a column to distinguish the two groups between 'START' and 'END' -- hence a nested query to generate a "session id" .. w_sess_id AS ( SELECT CONDITIONAL_TRUE_EVENT(nam='START') OVER ( ORDER BY num ) AS sess_id , * FROM komm ) SELECT mandt , docnum , MAX(CASE nam WHEN 'START' THEN num END) AS num_unh --^-- this returns NULL if nam is not 'START' , 'GE' AS unh_seg -- I have no idea where you could get this from, so I put in a constant , MAX(CASE nam WHEN 'END' THEN RIGHT(msg,5) END) AS msg FROM w_sess_id GROUP BY sess_id , mandt , docnum; -- out mandt | docnum | num_unh | unh_seg | msg -- out -------+----------+---------+---------+------- -- out 200 | 45320824 | 3 | GE | 34567 -- out 200 | 45320824 | 22 | GE | 34567 -- out (2 rows) -- out -- out Time: First fetch (2 rows): 46.210 ms. All rows formatted: 46.301 ms