I have one query that returns the following HTML element:
x
<span CreatedFromTXTextControl="1" style="font-family:Tahoma;font-size:8pt;">
<p lang="en-US" style="text-indent:0pt;margin-left:0pt;margin-top:0pt;margin-bottom:6pt;margin-right:0pt;line-height:100%;" xmlns="http://www.w3.org/1999/xhtml"><span style="font-family:'Tahoma';font-size:8pt;">I am not sure how to extract this text using SQL.</span></p>
</span>
My query currently looks something like this:
SELECT text
FROM MyTable
How shall I change this query to return only the text inside the span
element shown above?
In the example above the result should be the string:
I am not sure how to extract this text using SQL.
.
DBMS implementation: MS SQL Server.
Advertisement
Answer
Try this:
DECLARE @Data table ( html varchar(4000) );
INSERT INTO @Data VALUES (
'<span CreatedFromTXTextControl="1" style="font-family:Tahoma;font-size:8pt;">
<p lang="en-US" style="text-indent:0pt;margin-left:0pt;margin-top:0pt;margin-bottom:6pt;margin-right:0pt;line-height:100%;" xmlns="http://www.w3.org/1999/xhtml">
<span style="font-family:Tahoma;font-size:8pt;">I am not sure how to extract this text using SQL.</span>
</p>
</span>'
);
SELECT
t.f.value( '.', 'varchar(255)' ) AS span_value
FROM @Data AS d
CROSS APPLY (
SELECT CAST ( d.html AS xml ) AS h
) AS x
CROSS APPLY x.h.nodes( '//span/*' ) t(f);
Returns
+---------------------------------------------------+
| span_value |
+---------------------------------------------------+
| I am not sure how to extract this text using SQL. |
+---------------------------------------------------+
Using SQL Server’s XML datatype can get you what you need–assuming your HTML is valid.