I am trying to find the most frequent string ignoring everything after (
within it.
So, how it should work. If I’ve got the strings:
England (88) Iceland (100) Iceland (77) England (88) Denmark (15) Iceland (18)
It should return
Iceland
because it’s the most frequent country here and no matter that as a string England (88) is going to pretend.
Unfortunately, my query returns
England(88)
I’ve been thinking to do it by 2 steps:
- truncate every country string
- do script that I already written.
But I failed on the first step.
Advertisement
Answer
SQL Fiddle is acting up, so can’t test, but I’d think you could use SUBSTR()
and INSTR()
to isolate the portion left of the first (
:
SELECT SUBSTR(X,1,INSTR(X,'(')-1) AS HUS FROM tt GROUP BY SUBSTR(X,1,INSTR(X,'(')-1) ORDER BY COUNT(*) DESC LIMIT 1;
Edit: Tested on https://sqliteonline.com/ and it returns Iceland
as expected: Fiddle.