Skip to content
Advertisement

SQL/Regex Challenge/Puzzle: How to remove comments from SQL code (by using SQL query)?

Requirements

  • Single-lines comments (e.g. — my comment) should be removed.
  • Multi-line comments (e.g. /* my comment */) should be removed.
  • The content of strings literals (e.g. ‘this is a multi-line comment: /* my comment */’) should be ignored.
  • The content of identifiers (e.g. “– column 1 –“) should be ignored.

literals and identifiers

Literals and identifiers can span over multiple lines

Single-line comments

A single-line comment might be the last element of the code and might not end with a newline.

Nested multi-line comments

In databases such as SQL Server and PostgreSQL, multi-line comments can be nested, e.g –

/* outer comment /* inner comment */ */

The following code is invalid since only the inner comment is closed:

/* opened outer comment /* closed inner comment */

In databases such as Teradata, Oracle, MySql and SQLite there is no concept of nested comments. The following code is invalid since the comment is already closed with the leftmost */.

/* comment /* is closed */ ERROR */

This however is a valid code:

/* comment /* still the same comment */

Advertisement

Answer

Solutions

Teradata

with t (txt) as 
(
select     '
            select    /* comment /* yada yada yada /* / // bla bla bla  
                        1
                                    */ t1.i
                   ,''"SRC''''"''    as "This''is''the
                                ''source"

            from      t1 /* "Comment 2" - '' */ cross join t2 -- /* comment 3 */

            where     t2.v = ''/*DST"*
                                /'' -- comment 4'
)

select    regexp_replace (txt,'(''.*?''|".*?")|/*.*?*/|--.*?(?=[rn]|$)','1',1,0,'n')     as clean_txt

from      t
;

Oracle

with t (txt) as 
(
select     '
            select    /* comment /* yada yada yada /* / // bla bla bla  
                        1
                                    */ t1.i
                   ,''"SRC''''"''    as "This''is''the
                                ''source"

            from      t1 /* "Comment 2" - '' */ cross join t2 -- /* comment 3 */

            where     t2.v = ''/*DST"*
                                /'' -- comment 4'

from        dual
)

select    regexp_replace (txt,'(''.*?''|".*?")|/*.*?*/|--.*?(?=$|Z)','1',1,0,'nm')

from      t
;

Result

            select     t1.i
                   ,'"SRC''"'    as "This'is'the
                                'source"

            from      t1  cross join t2 

            where     t2.v = '/*DST"*
                                /'
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement