I have one query that returns the following HTML element:
<span CreatedFromTXTextControl="1" style="font-family:Tahoma;font-size:8pt;"> <p lang="en-US" style="text-indent:0pt;margin-left:0pt;margin-top:0pt;margin-bottom:6pt;margin-right:0pt;line-height:100%;" xmlns="http://www.w3.org/1999/xhtml"><span style="font-family:'Tahoma';font-size:8pt;">I am not sure how to extract this text using SQL.</span></p> </span>
My query currently looks something like this:
SELECT text FROM MyTable
How shall I change this query to return only the text inside the span
element shown above?
In the example above the result should be the string:
I am not sure how to extract this text using SQL.
.
DBMS implementation: MS SQL Server.
Advertisement
Answer
Try this:
DECLARE @Data table ( html varchar(4000) ); INSERT INTO @Data VALUES ( '<span CreatedFromTXTextControl="1" style="font-family:Tahoma;font-size:8pt;"> <p lang="en-US" style="text-indent:0pt;margin-left:0pt;margin-top:0pt;margin-bottom:6pt;margin-right:0pt;line-height:100%;" xmlns="http://www.w3.org/1999/xhtml"> <span style="font-family:Tahoma;font-size:8pt;">I am not sure how to extract this text using SQL.</span> </p> </span>' ); SELECT t.f.value( '.', 'varchar(255)' ) AS span_value FROM @Data AS d CROSS APPLY ( SELECT CAST ( d.html AS xml ) AS h ) AS x CROSS APPLY x.h.nodes( '//span/*' ) t(f);
Returns
+---------------------------------------------------+ | span_value | +---------------------------------------------------+ | I am not sure how to extract this text using SQL. | +---------------------------------------------------+
Using SQL Server’s XML datatype can get you what you need–assuming your HTML is valid.