I’m using ANTLR with the Presto grammar to parse SQL queries. This is the definition of the string rule I’m using:
STRING : '\'' ( ('\\' '\'') | ~'\'' | '\'\'' )* '\'' ;
However, when I have a query like this:
select replace(name,'\'','') FROM table1;
it messes things up, because it parses '\'',' as one string.
When I used the following rule instead:
STRING : '\'' ( ('\\' '\'') | ~'\'' )* '\'' ;
it didn’t correctly parse queries like:
SELECT * FROM table1 where col1 = 'nir''s'
which of course is a legal query.
Any idea how I can catch both?
Thanks, Nir.
Answer
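If you want to support \', you should not only negate the single quote, but also negate the backslash. Otherwise the backslash in \' can also be matched as an ordinary character, the quotes that follow can then be read as a doubled quote, and the lexer, which prefers the longest match, runs past the end of the literal.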
Something like this:
STRING
 : '\'' ( '\\' '\''   // match \'
        | ~[\\']      // match anything other than \ and '
        | '\'\''      // match ''
        )* '\''
 ;
And to account for different escaped characters, do this:
STRING
 : '\'' ( '\\' ~[\r\n] // match \ followed by any char other than a line break
        | ~[\\']       // match anything other than \ and '
        | '\'\''       // match ''
        )* '\''
 ;
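As a quick sanity check outside ANTLR, the last rule can be approximated with a regular expression. This is only a sketch: the `STRING` pattern and the `string_literals` helper are hypothetical names for illustration, and a backtracking regex engine does not resolve alternatives the way an ANTLR lexer does. Here the three alternatives start with different characters, so the two should agree on these inputs.

```python
import re

# Regex approximation of the STRING lexer rule (an assumption, not ANTLR itself).
STRING = re.compile(
    r"""'              # opening quote
    (?: \\[^\r\n]      # backslash followed by any char other than a line break
      | [^\\']         # anything other than a backslash and a quote
      | ''             # doubled quote
    )*
    '                  # closing quote
    """,
    re.VERBOSE,
)


def string_literals(sql):
    """Return every substring of sql that the pattern accepts as a string literal."""
    return [m.group(0) for m in STRING.finditer(sql)]


# The two queries from the question.
print(string_literals(r"select replace(name,'\'','') FROM table1;"))
print(string_literals("SELECT * FROM table1 where col1 = 'nir''s'"))
```

The first query should yield two separate literals, the escaped quote and the empty string, and the second should yield the single literal containing the doubled quote.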