I'm using ANTLR with the Presto grammar to parse SQL queries. This is the definition of the STRING rule I'm using:
STRING
: '\'' ( ('\\' '\'') | ~'\'' | '\'\'' )* '\''
;
However, when I have a query like this:
select replace(name,'\'','')
FROM table1;
it messes things up, because it parses '\'',' as one string.
When I used the following rule instead:
STRING
: '\'' ( ('\\' '\'') | ~'\'' )* '\''
;
it didn't correctly parse queries like:
SELECT * FROM table1 where col1 = 'nir''s'
which of course is a legal query.
Any idea how I can handle both?
Thanks, Nir.
Answer
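If you want to support \', you should not only negate the single quote, but also negate the backslash. Otherwise a backslash can also be matched as an ordinary character by ~'\'', and since the lexer prefers the longest possible match, it can end up pairing the quotes differently than you intended.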
Something like this:
STRING
 : '\'' ( '\\' '\''    // match \'
        | ~[\\']       // match anything other than \ and '
        | '\'\''       // match ''
        )*
   '\''
 ;
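To see why excluding the backslash matters, here is a minimal Python sketch that simulates longest-match tokenization for the quoted-string rule. It is not ANTLR and not generated lexer code; the function name `longest_string_token` and its `backslash_is_ordinary` flag are made up for illustration. With the flag set to `True` it behaves like the rule from the question (a backslash may also count as an ordinary character), with `False` like the rule with the backslash excluded.

```python
def longest_string_token(text, start, backslash_is_ordinary):
    """Return the end index of the longest quoted token at `start`, or None.

    The loop body is (backslash-quote | ordinary char | doubled quote)*.
    Every position the loop can reach is explored, which mimics a lexer
    that always takes the longest possible match.
    """
    if start >= len(text) or text[start] != "'":
        return None
    # Characters that the "ordinary char" alternative must NOT match.
    excluded = "'" if backslash_is_ordinary else "'\\"
    reachable = {start + 1}      # positions reachable after the opening quote
    frontier = [start + 1]
    best = None
    while frontier:
        i = frontier.pop()
        if text.startswith("'", i):
            best = max(best or 0, i + 1)   # a closing quote can end the token here
        nxt = []
        if text.startswith("\\'", i):      # escaped quote: \'
            nxt.append(i + 2)
        if text.startswith("''", i):       # doubled quote: ''
            nxt.append(i + 2)
        if i < len(text) and text[i] not in excluded:
            nxt.append(i + 1)              # any other single character
        for j in nxt:
            if j not in reachable:
                reachable.add(j)
                frontier.append(j)
    return best


query = "select replace(name,'\\'','')\nFROM table1;"
start = query.index("'")
print(query[start:longest_string_token(query, start, True)])   # '\'','
print(query[start:longest_string_token(query, start, False)])  # '\''
```

With the backslash allowed as an ordinary character, the longest match reads `\` as a plain character and `''` as a doubled quote, which reproduces the `'\'','` token from the question. Once the backslash is excluded, `\'` is the only way to consume it, so the token ends where expected.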
And to allow other escaped characters as well (not just \'), do this:
STRING
 : '\'' ( '\\' ~[\r\n] // match \ followed by any char other than a line break
        | ~[\\']       // match anything other than \ and '
        | '\'\''       // match ''
        )*
   '\''
 ;
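If you want a quick sanity check outside of ANTLR, the last rule can be approximated with a Python regular expression. This is only a sketch: the names `STRING` and `string_tokens` are made up here, and `findall` scans for matches anywhere in the text, whereas a real lexer tokenizes the whole input in order (so comments, identifiers and so on are ignored by this check).

```python
import re

# Regex approximation of the last STRING rule (not ANTLR itself):
#   \\[^\r\n]   a backslash followed by any char other than a line break
#   [^\\']      anything other than a backslash and a single quote
#   ''          a doubled single quote
STRING = re.compile(r"'(?:\\[^\r\n]|[^\\']|'')*'")


def string_tokens(sql):
    """Return every substring of `sql` that the STRING pattern matches."""
    return STRING.findall(sql)


print(string_tokens("select replace(name,'\\'','')\nFROM table1;"))
print(string_tokens("SELECT * FROM table1 where col1 = 'nir''s'"))
```

The first query should yield the two literals `'\''` and `''`, and the second the single literal `'nir''s'`, which are the two cases from the question.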