Oracle SQL – Splitting string every 70 characters without splitting a word and deriving sequence number

Question

I currently working on a table on Oracle and I need to split one its field (RH02X) into multiple rows when it the field exceeds 70 characters but I also need to make sure that no word would be cut. I am only allowed to use SQL without PL/SQL. For example, I have the following data from the table test_remarks

Accepted Answer

You can use recursive subquery factoring to identify the first point to split the string (if needed) in the anchor member, and pass the remainder of the string to the recursive member to repeat the process.This uses instr() to find the last space character in the first 70 characters, as pos &#8211; if there is no space or the string is less than 70 characters that will be zero. The pos value is then used to split the string if it&#8217;s long enough and can be split, or take up to 70 characters otherwise.with rcte (refno, rh01d, rh02x, rh03x, rn1, rn2, line_num, remaining) as (  select refno,    rh01d,    substr(rh02x, 1, case when pos = 0 then 70 else pos - 1 end),    rh03x,    row_number() over (order by refno, rh01d, rh03x), -- preferably also some unique column?    dense_rank() over (order by refno, rh01d, rh03x), -- preferably also some unique column?    1,    substr(rh02x, case when pos = 0 then 70 else pos end + 1)  from test_remarks  cross apply (    select case when length(rh02x) <= 70 then 0      else instr(substr(rh02x, 1, 70), ' ', -1) end as pos    from dual  )  union all  select refno,    rh01d,    substr(remaining, 1, case when pos = 0 then 70 else pos - 1 end),    rh03x,    rn1,    rn2,    line_num + 1,    substr(remaining, case when pos = 0 then 70 else pos end + 1)  from rcte  cross apply (    select case when length(remaining) <= 70 then 0      else instr(substr(remaining, 1, 70), ' ', -1) end as pos    from dual  )  where remaining is not null)cycle remaining set is_cycle to 'Y' default 'N'select refno, rh01d, rh02x, rh03x,  dense_rank() over (partition by rn2 order by rn1, line_num) as seq,  length(rh02x) as length_rh02x, rn1, rn2, line_num -- for info onlyfrom rcteorder by rn1, line_num;Oracle 12c is required because of the cross apply; it will be (even) messier on earlier versions.The recursive member is essentially doing the same as the anchor member, except the anchor is manipulating the original string from the table while the recursive is manipulating the remainder from the previous iteration.The rn1, rn2 and line_num are used to track which original value is being referred to and which segment this is. They are set in the anchor member, and kept (rn1 and rn2) or incremented (line_num) in the recursive member. Those are used to order the output so related lines are together and in the same order; and also used to work out the seq value. I&#8217;ve left those values in the output so you can see what&#8217;s happening, but they don&#8217;t need to be included in your final query.REFNORH01DRH02XRH03XSEQLENGTH_RH02XRN1RN2LINE_NUM000115608-DEC-10Document delete Description: FW Test Payment DHIS ID: 10616132DOCD162111000115608-DEC-10DOCL.source: EMAIL DOCL.sourcid: 10616132 DOCL.imageid: 26724898DOCD264112000115608-DEC-10Document delete Description: RE Test Payment DHIS ID: 10619470DOCD362211000115608-DEC-10DOCL.source: EMAIL DOCL.sourcid: 10619470 DOCL.imageid: 26733538DOCD464212db<>fiddle with some extra data to show edge cases, and some of the intermediate steps which might clarify things a bit.The cycle clause probably shouldn&#8217;t be necessary here, as there are no cycles (as far as I can see) and it evaluates to N for all rows. But it errors without it, throwing &#8220;ORA-32044: cycle detected while executing recursive WITH query&#8221;. This may be related to ANSI behaviour, as in this recent question.

REFNO	RH01D	RH02X	RH03X	SEQNO	LENGTH_RH02X
0001156	08-DEC-10	Document delete Description: FW Test Payment DHIS ID: 10616132 DOCL.so	DOCD	1	70
0001156	08-DEC-10	urce: EMAIL DOCL.sourcid: 10616132 DOCL.imageid: 26724898	DOCD	2	57
0001156	08-DEC-10	Document delete Description: RE Test Payment DHIS ID: 10619470 DOCL.so	DOCD	1	70
0001156	08-DEC-10	urce: EMAIL DOCL.sourcid: 10619470 DOCL.imageid: 26733538	DOCD	2	57

Advertisement

Answer