On Oracle database, I have individual queries which I would like to add based on substitution in the search word. Consider each word can have its word substitution and additional synonym substitution. For instance, the word 4MULA has word substitution such as formula, fourmula. The word VIEW has word substitution such as view, vu and Education has synonyms such as university, college, school
The structure of the where has the following order (3 words):
select * from dual where name like '%4MULA VIEW%' || ' %UNIVERSITY% '
Each word has to be substitute by the all possible values to replace it. I have done individual queries, but then I have an excessive amount of queries (about 500 individual queries replacing all possible substitutions). For instance, another individual query would be:
select * from dual where name like '%FOURMULA VU%' || ' %SCHOOL% '
The final number of queries depends on the total number of permutations:
FORMULA VIEW UNIVERSITY 4MULA VIEW UNIVERSITY FOURMULA VIEW UNIVERSITY FORMULA VU UNIVERSITY 4MULA VU UNIVERSITY FOURMULA VU UNIVERSITY ... FOURMULA VU SCHOOL
To optimize time I added all possible permutations using OR condition. For instance, I have added as follows:
select * from dual where name like '%4MULA VIEW%' || ' %UNIVERSITY% ' OR name like '%FOURMULA VU%' || ' %SCHOOL% ' ... OR name like '%FOURMULA VU%' || ' %SCHOOL '
What is the best way to add all permutations in a single query?
Advertisement
Answer
You can use object type and collections to create a list of synonyms for each word and then create a user-defined function to handle the substitutions:
Oracle Setup:
CREATE TYPE wordlist IS TABLE OF VARCHAR2(30); CREATE TYPE word_synonyms IS OBJECT( word VARCHAR2(30), synonyms wordlist ); CREATE TYPE word_synonyms_list IS TABLE OF word_synonyms; CREATE FUNCTION replace_synonyms( text IN VARCHAR2, synonym_list IN word_synonyms_list ) RETURN VARCHAR2 DETERMINISTIC IS p_text VARCHAR2(4000) := text; BEGIN FOR i IN 1 .. synonym_list.COUNT LOOP FOR j IN 1 .. synonym_list(i).synonyms.COUNT LOOP p_text := REGEXP_REPLACE( p_text, '(^|[^a-z0-9])' || synonym_list(i).synonyms(j) || '($|[^a-z0-9])', '1' || synonym_list(i).word || '2', 1, 0, 'i' ); -- Handle repeated synonyms. p_text := REGEXP_REPLACE( p_text, '(^|[^a-z0-9])' || synonym_list(i).synonyms(j) || '($|[^a-z0-9])', '1' || synonym_list(i).word || '2', 1, 0, 'i' ); END LOOP; END LOOP; RETURN p_text; END; /
Test Data:
CREATE TABLE test_data ( text ) AS SELECT 'FORMULA VIEW UNIVERSITY' FROM DUAL UNION ALL SELECT '4MULA VIEW UNIVERSITY' FROM DUAL UNION ALL SELECT 'FOURMULA VIEW UNIVERSITY' FROM DUAL UNION ALL SELECT 'FORMULA VU UNIVERSITY' FROM DUAL UNION ALL SELECT '4MULA VU UNIVERSITY' FROM DUAL UNION ALL SELECT 'FOURMULA VU UNIVERSITY' FROM DUAL UNION ALL SELECT 'FOURMULA VU SCHOOL' FROM DUAL UNION ALL SELECT 'FOURMULAE VULCAN SCHOOLING' FROM DUAL;
Query:
SELECT text, replace_synonyms( text, word_synonyms_list( word_synonyms( '4MULA', wordlist( 'formula', 'fourmula' ) ), word_synonyms( 'VIEW', wordlist( 'vu' ) ), word_synonyms( 'UNIVERSITY', wordlist( 'school' ) ) ) ) AS substitutions FROM test_data
outputs:
TEXT | SUBSTITUTIONS :------------------------- | :------------------------- FORMULA VIEW UNIVERSITY | 4MULA VIEW UNIVERSITY 4MULA VIEW UNIVERSITY | 4MULA VIEW UNIVERSITY FOURMULA VIEW UNIVERSITY | 4MULA VIEW UNIVERSITY FORMULA VU UNIVERSITY | 4MULA VIEW UNIVERSITY 4MULA VU UNIVERSITY | 4MULA VIEW UNIVERSITY FOURMULA VU UNIVERSITY | 4MULA VIEW UNIVERSITY FOURMULA VU SCHOOL | 4MULA VIEW UNIVERSITY FOURMULAE VULCAN SCHOOLING | FOURMULAE VULCAN SCHOOLING
and:
SELECT text FROM test_data WHERE replace_synonyms( text, word_synonyms_list( word_synonyms( '4MULA', wordlist( 'formula', 'fourmula' ) ), word_synonyms( 'VIEW', wordlist( 'vu' ) ), word_synonyms( 'UNIVERSITY', wordlist( 'school' ) ) ) ) LIKE '%4MULA VIEW%%UNIVERSITY%'
outputs:
| TEXT | | :----------------------- | | FORMULA VIEW UNIVERSITY | | 4MULA VIEW UNIVERSITY | | FOURMULA VIEW UNIVERSITY | | FORMULA VU UNIVERSITY | | 4MULA VU UNIVERSITY | | FOURMULA VU UNIVERSITY | | FOURMULA VU SCHOOL |
db<>fiddle here
Currently, in test_data table I’d have just a record such as
FORMULA VIEW UNIVERSITY
. In addition, I have a table for word substitution and synonyms in very similar fashion you included. Considering I have just the recordFORMULA VIEW UNIVERSITY
how could I invoke replace_synonyms such as:SELECT text FROM test_data WHERE replace_synonyms(...)) LIKE '%4MULA VU%%UNIVERSITY%'
. Notice that usually I will want to compare a new name which is not in test_data as the case of'%4MULA VU%%UNIVERSITY%'
Swap it round so that the synonyms are normalised in the LIKE
filter and not in the text in the table:
CREATE TABLE test_data ( text ) AS SELECT 'FORMULA VIEW UNIVERSITY' FROM DUAL;
Then:
SELECT text FROM test_data WHERE text LIKE replace_synonyms( '%4MULA VU%%SCHOOL%', word_synonyms_list( word_synonyms( 'FORMULA', wordlist( '4mula', 'fourmula' ) ), word_synonyms( 'VIEW', wordlist( 'vu' ) ), word_synonyms( 'UNIVERSITY', wordlist( 'school' ) ) ) )
Outputs:
| TEXT | | :---------------------- | | FORMULA VIEW UNIVERSITY |
db<>fiddle here