I have looked at dozen ‘similar’ questions, without luck. I understand that +
attached to a keyword in FULLTEXT in BOOLEAN mode has a special meaning,
however, what if our keyword(s) actually contain +
symbol as a suffix part of the text/string. How can we still use FULL-TEXT search and get proper results?
DATABASE STRUCTURE
CREATE TABLE `ft_test` ( `i_id` int(11) NOT NULL, `i_desc` mediumtext NOT NULL ) ENGINE=MyISAM DEFAULT CHARSET=utf8; ALTER TABLE `ft_test` ADD PRIMARY KEY (`i_id`) USING BTREE; ALTER TABLE `ft_test` ADD FULLTEXT KEY `i_desc` (`i_desc`); ALTER TABLE `ft_test` MODIFY `i_id` int(11) NOT NULL AUTO_INCREMENT; COMMIT;
DATABASE DATA
SELECT * FROM ft_test; +------+-----------+ | i_id | i_desc | +------+-----------+ | 1 | test | | 2 | test+ | | 3 | test++ | | 4 | test + | | 5 | test plus | +------+-----------+
TEST #1: LIKE QUERY
SELECT * FROM ft_test WHERE i_desc LIKE 'test+%'; +------+--------+ | i_id | i_desc | +------+--------+ | 2 | test+ | | 3 | test++ | +------+--------+
TEST #2: FULLTEXT QUERY
SELECT *, MATCH(`i_desc`) AGAINST ('"test+"' IN BOOLEAN MODE) AS RELEVANCE -> FROM `ft_test` -> WHERE MATCH(`i_desc`) AGAINST ('"test+"' IN BOOLEAN MODE) -> ORDER BY RELEVANCE; +------+-----------+-----------+ | i_id | i_desc | RELEVANCE | +------+-----------+-----------+ | 1 | test | 1 | | 2 | test+ | 1 | | 3 | test++ | 1 | | 4 | test + | 1 | | 5 | test plus | 1 | +------+-----------+-----------+
As you can see, LIKE query actually returned and sorted results better in this case. I have also tried using quotes for exact match, same results. Adding ‘special’ characters like ‘test+’ didn’t help, either. While, FT results are not useless, they are less than perfect, because ordering is not what I expect it to be.
QUESTION
Is it actually possible to achieve this and return same results as in LIKE using FULLTEXT mode? If yes, how?
Thanks!
Advertisement
Answer
You can only index punctuation characters if you define your column as using a collation that treats the characters as normal letters, not punctuation.
There’s a manual page that shows steps to do this: https://dev.mysql.com/doc/refman/8.0/en/full-text-adding-collation.html
But since you said your script is hard-coded, I suppose you don’t have access to redefine the collation for the column you are searching, or to restart the MySQL server after installing a custom collation definition.
A workaround is to search for the word without the punctuation, then add a condition to be applied after finding matching words.
SELECT *, MATCH(`i_desc`) AGAINST ('"test"' IN BOOLEAN MODE) AS RELEVANCE FROM `ft_test` WHERE MATCH(`i_desc`) AGAINST ('"test"' IN BOOLEAN MODE) AND i_desc LIKE 'test+%'; ORDER BY RELEVANCE;
It’ll use the fulltext index to find rows that match the word, and then the other condition term will filter against that (hopefully small) set of rows for those that have the string with the +
included.
But again, if you don’t have any access to change the SQL query, this is moot.