Skip to content
Advertisement

SQL: read text inside the span HTML element returned by a query

I have one query that returns the following HTML element:

<span CreatedFromTXTextControl="1" style="font-family:Tahoma;font-size:8pt;">
<p lang="en-US" style="text-indent:0pt;margin-left:0pt;margin-top:0pt;margin-bottom:6pt;margin-right:0pt;line-height:100%;" xmlns="http://www.w3.org/1999/xhtml"><span style="font-family:'Tahoma';font-size:8pt;">I am not sure how to extract this text using SQL.</span></p>
</span>

My query currently looks something like this:

SELECT text
FROM MyTable

How shall I change this query to return only the text inside the span element shown above?

In the example above the result should be the string: I am not sure how to extract this text using SQL..

DBMS implementation: MS SQL Server.

Advertisement

Answer

Try this:

DECLARE @Data table ( html varchar(4000) );
INSERT INTO @Data VALUES ( 
    '<span CreatedFromTXTextControl="1" style="font-family:Tahoma;font-size:8pt;">
      <p lang="en-US" style="text-indent:0pt;margin-left:0pt;margin-top:0pt;margin-bottom:6pt;margin-right:0pt;line-height:100%;" xmlns="http://www.w3.org/1999/xhtml">
        <span style="font-family:Tahoma;font-size:8pt;">I am not sure how to extract this text using SQL.</span>
      </p>
     </span>'
);

SELECT
    t.f.value( '.', 'varchar(255)' ) AS span_value
FROM @Data AS d
CROSS APPLY (
    SELECT CAST ( d.html AS xml ) AS h
) AS x
CROSS APPLY x.h.nodes( '//span/*' ) t(f);

Returns

+---------------------------------------------------+
|                   span_value                      |
+---------------------------------------------------+
| I am not sure how to extract this text using SQL. |
+---------------------------------------------------+

Using SQL Server’s XML datatype can get you what you need–assuming your HTML is valid.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement