Skip to content
Advertisement

Use of LIKE in Oracle with high number of variables

On Oracle database, I have individual queries which I would like to add based on substitution in the search word. Consider each word can have its word substitution and additional synonym substitution. For instance, the word 4MULA has word substitution such as formula, fourmula. The word VIEW has word substitution such as view, vu and Education has synonyms such as university, college, school

The structure of the where has the following order (3 words):

select * 
from dual 
where name like '%4MULA VIEW%' || ' %UNIVERSITY% ' 

Each word has to be substitute by the all possible values to replace it. I have done individual queries, but then I have an excessive amount of queries (about 500 individual queries replacing all possible substitutions). For instance, another individual query would be:

select *
from dual
where name like '%FOURMULA VU%' || ' %SCHOOL% ' 

The final number of queries depends on the total number of permutations:

FORMULA VIEW UNIVERSITY
4MULA VIEW UNIVERSITY
FOURMULA VIEW UNIVERSITY
FORMULA VU UNIVERSITY
4MULA VU UNIVERSITY
FOURMULA VU UNIVERSITY
...
FOURMULA VU SCHOOL

To optimize time I added all possible permutations using OR condition. For instance, I have added as follows:

select * 
from dual 
where name like '%4MULA VIEW%' || ' %UNIVERSITY% ' 
      OR name like '%FOURMULA VU%' || ' %SCHOOL% ' 
      ...
      OR name like '%FOURMULA VU%' || ' %SCHOOL '

What is the best way to add all permutations in a single query?

Advertisement

Answer

You can use object type and collections to create a list of synonyms for each word and then create a user-defined function to handle the substitutions:

Oracle Setup:

CREATE TYPE wordlist IS TABLE OF VARCHAR2(30);

CREATE TYPE word_synonyms IS OBJECT(
  word     VARCHAR2(30),
  synonyms wordlist
);

CREATE TYPE word_synonyms_list IS TABLE OF word_synonyms;

CREATE FUNCTION replace_synonyms(
  text         IN VARCHAR2,
  synonym_list IN word_synonyms_list
) RETURN VARCHAR2 DETERMINISTIC
IS
  p_text VARCHAR2(4000) := text;
BEGIN
  FOR i IN 1 .. synonym_list.COUNT LOOP
    FOR j IN 1 .. synonym_list(i).synonyms.COUNT LOOP
      p_text := REGEXP_REPLACE(
                  p_text,
                  '(^|[^a-z0-9])' || synonym_list(i).synonyms(j) || '($|[^a-z0-9])',
                  '1' || synonym_list(i).word || '2',
                  1,
                  0,
                  'i'
                );
      -- Handle repeated synonyms.
      p_text := REGEXP_REPLACE(
                  p_text,
                  '(^|[^a-z0-9])' || synonym_list(i).synonyms(j) || '($|[^a-z0-9])',
                  '1' || synonym_list(i).word || '2',
                  1,
                  0,
                  'i'
                );
    END LOOP;
  END LOOP;
  RETURN p_text;
END;
/

Test Data:

CREATE TABLE test_data ( text ) AS
SELECT 'FORMULA VIEW UNIVERSITY'    FROM DUAL UNION ALL
SELECT '4MULA VIEW UNIVERSITY'      FROM DUAL UNION ALL
SELECT 'FOURMULA VIEW UNIVERSITY'   FROM DUAL UNION ALL
SELECT 'FORMULA VU UNIVERSITY'      FROM DUAL UNION ALL
SELECT '4MULA VU UNIVERSITY'        FROM DUAL UNION ALL
SELECT 'FOURMULA VU UNIVERSITY'     FROM DUAL UNION ALL
SELECT 'FOURMULA VU SCHOOL'         FROM DUAL UNION ALL
SELECT 'FOURMULAE VULCAN SCHOOLING' FROM DUAL;

Query:

SELECT text,
       replace_synonyms(
         text,
         word_synonyms_list(
           word_synonyms( '4MULA',      wordlist( 'formula', 'fourmula' ) ),
           word_synonyms( 'VIEW',       wordlist( 'vu' ) ),
           word_synonyms( 'UNIVERSITY', wordlist( 'school' ) )
         )
       ) AS substitutions
FROM   test_data

outputs:

TEXT                       | SUBSTITUTIONS             
:------------------------- | :-------------------------
FORMULA VIEW UNIVERSITY    | 4MULA VIEW UNIVERSITY     
4MULA VIEW UNIVERSITY      | 4MULA VIEW UNIVERSITY     
FOURMULA VIEW UNIVERSITY   | 4MULA VIEW UNIVERSITY     
FORMULA VU UNIVERSITY      | 4MULA VIEW UNIVERSITY     
4MULA VU UNIVERSITY        | 4MULA VIEW UNIVERSITY     
FOURMULA VU UNIVERSITY     | 4MULA VIEW UNIVERSITY     
FOURMULA VU SCHOOL         | 4MULA VIEW UNIVERSITY     
FOURMULAE VULCAN SCHOOLING | FOURMULAE VULCAN SCHOOLING

and:

SELECT text
FROM   test_data
WHERE  replace_synonyms(
         text,
         word_synonyms_list(
           word_synonyms( '4MULA',      wordlist( 'formula', 'fourmula' ) ),
           word_synonyms( 'VIEW',       wordlist( 'vu' ) ),
           word_synonyms( 'UNIVERSITY', wordlist( 'school' ) )
         )
       )
       LIKE '%4MULA VIEW%%UNIVERSITY%'

outputs:

| TEXT                     |
| :----------------------- |
| FORMULA VIEW UNIVERSITY  |
| 4MULA VIEW UNIVERSITY    |
| FOURMULA VIEW UNIVERSITY |
| FORMULA VU UNIVERSITY    |
| 4MULA VU UNIVERSITY      |
| FOURMULA VU UNIVERSITY   |
| FOURMULA VU SCHOOL       |

db<>fiddle here


Currently, in test_data table I’d have just a record such as FORMULA VIEW UNIVERSITY. In addition, I have a table for word substitution and synonyms in very similar fashion you included. Considering I have just the record FORMULA VIEW UNIVERSITY how could I invoke replace_synonyms such as: SELECT text FROM test_data WHERE replace_synonyms(...)) LIKE '%4MULA VU%%UNIVERSITY%'. Notice that usually I will want to compare a new name which is not in test_data as the case of '%4MULA VU%%UNIVERSITY%'

Swap it round so that the synonyms are normalised in the LIKE filter and not in the text in the table:

CREATE TABLE test_data ( text ) AS
SELECT 'FORMULA VIEW UNIVERSITY'    FROM DUAL;

Then:

SELECT text
FROM   test_data
WHERE  text LIKE replace_synonyms(
         '%4MULA VU%%SCHOOL%',
         word_synonyms_list(
           word_synonyms( 'FORMULA',      wordlist( '4mula', 'fourmula' ) ),
           word_synonyms( 'VIEW',       wordlist( 'vu' ) ),
           word_synonyms( 'UNIVERSITY', wordlist( 'school' ) )
         )
       )

Outputs:

| TEXT                    |
| :---------------------- |
| FORMULA VIEW UNIVERSITY |

db<>fiddle here

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement