Skip to content
Advertisement

MySQL FullText in Boolean mode: searching keywords that contain ‘+’ as part of the string

I have looked at dozen ‘similar’ questions, without luck. I understand that + attached to a keyword in FULLTEXT in BOOLEAN mode has a special meaning, however, what if our keyword(s) actually contain + symbol as a suffix part of the text/string. How can we still use FULL-TEXT search and get proper results?

DATABASE STRUCTURE

CREATE TABLE `ft_test` (
  `i_id` int(11) NOT NULL,
  `i_desc` mediumtext NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

ALTER TABLE `ft_test`
  ADD PRIMARY KEY (`i_id`) USING BTREE;
ALTER TABLE `ft_test` ADD FULLTEXT KEY `i_desc` (`i_desc`);

ALTER TABLE `ft_test`
  MODIFY `i_id` int(11) NOT NULL AUTO_INCREMENT;
COMMIT;

DATABASE DATA

SELECT * FROM ft_test;
+------+-----------+
| i_id | i_desc    |
+------+-----------+
|    1 | test      |
|    2 | test+     |
|    3 | test++    |
|    4 | test +    |
|    5 | test plus |
+------+-----------+

TEST #1: LIKE QUERY

SELECT * FROM ft_test WHERE i_desc LIKE 'test+%';
+------+--------+
| i_id | i_desc |
+------+--------+
|    2 | test+  |
|    3 | test++ |
+------+--------+

TEST #2: FULLTEXT QUERY

SELECT *, MATCH(`i_desc`) AGAINST ('"test+"' IN BOOLEAN MODE) AS RELEVANCE
    -> FROM `ft_test`
    -> WHERE MATCH(`i_desc`) AGAINST ('"test+"' IN BOOLEAN MODE)
    -> ORDER BY RELEVANCE;

+------+-----------+-----------+
| i_id | i_desc    | RELEVANCE |
+------+-----------+-----------+
|    1 | test      |         1 |
|    2 | test+     |         1 |
|    3 | test++    |         1 |
|    4 | test +    |         1 |
|    5 | test plus |         1 |
+------+-----------+-----------+

As you can see, LIKE query actually returned and sorted results better in this case. I have also tried using quotes for exact match, same results. Adding ‘special’ characters like ‘test+’ didn’t help, either. While, FT results are not useless, they are less than perfect, because ordering is not what I expect it to be.

QUESTION

Is it actually possible to achieve this and return same results as in LIKE using FULLTEXT mode? If yes, how?

Thanks!

Advertisement

Answer

You can only index punctuation characters if you define your column as using a collation that treats the characters as normal letters, not punctuation.

There’s a manual page that shows steps to do this: https://dev.mysql.com/doc/refman/8.0/en/full-text-adding-collation.html

But since you said your script is hard-coded, I suppose you don’t have access to redefine the collation for the column you are searching, or to restart the MySQL server after installing a custom collation definition.

A workaround is to search for the word without the punctuation, then add a condition to be applied after finding matching words.

SELECT *, MATCH(`i_desc`) AGAINST ('"test"' IN BOOLEAN MODE) AS RELEVANCE
FROM `ft_test`
WHERE MATCH(`i_desc`) AGAINST ('"test"' IN BOOLEAN MODE)
 AND i_desc LIKE 'test+%';
ORDER BY RELEVANCE;

It’ll use the fulltext index to find rows that match the word, and then the other condition term will filter against that (hopefully small) set of rows for those that have the string with the + included.

But again, if you don’t have any access to change the SQL query, this is moot.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement