Skip to content
Advertisement

How to omit html tags in a mysql table attribute while doing a select

I have a table where each row consist of an attribute which consist of html data with like this.

<div className="single_line"><p>New note example</p></div>

I need to omit the html tags and extract only the data inside the tags using sql query. Any idea on how to achieve this?. I tried out different regex but they didnt work.

Advertisement

Answer

There are 2 solutions based on mysql version.

If you are using MySQL 8.0 then you can use REGEXP_REPLACE() directly inside the select statement.

SELECT REGEXP_REPLACE('<div><p>New note example</p></div>', '(<[^>]*>)|(&nbsp;)', '');

If you are using MySQL 5.7 then you have to create a user define function in database to strip html tags.

DROP FUNCTION IF EXISTS fn_strip_html_tags;
    
CREATE FUNCTION fn_strip_html_tags( html_text TEXT ) RETURNS TEXT  
BEGIN  
     DECLARE start,end INT DEFAULT 1; 
     DECLARE text_without_nbsp TEXT;
     LOOP
        SET start = LOCATE("<", html_text, start);
        IF (!start) THEN RETURN html_text; END IF;
        SET end = LOCATE(">", html_text, start);
        IF (!end) THEN SET end = start; END IF;
        SET text_without_nbsp = REPLACE(html_text, "&nbsp;", " ");
        SET html_text = INSERT(text_without_nbsp, start, end - start + 1, "");
            END LOOP;
END

For example

SELECT fn_strip_html_tags('<div><p>New note example</p></div>');
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement