I am reading in an XML file into the database. It works fine however it is very slow. I was wondering if I can use text()
in order to speed the process up similarly to the solution in this post: Xml select query xpath is slow.
My problem however is that the solution is for XML with element-centric mapping while I have attribute-centric mapping so I’m not sure how to use text()
in my case.
The XML looks like this
<Level1> <Level2 name="l2_name"> <Level3 name="l3_name"> <Level4 name="l4_name"> <Level5 name="l5_name"> <Level6_1 name="l6_1_name" value="l6_1_value"/> <Level6_2 name="l6_2_name" value="l6_2_value"/>
I am using the following code
declare @x xml select @x = c1 from openrowset(bulk 'd:file.xml', single_blob) as table1(c1); select c.value('@name', 'varchar(20)') as [col1], c2.value('@name', 'varchar(20)') as [col2], c3.value('@name', 'varchar(20)') as [col3], c4.value('@name', 'varchar(20)') as [col4], c5_1.value('@name', 'varchar(20)') as [col5], c5_1.value('@value', 'varchar(20)') as [col6], c5_2.value('@name', 'varchar(20)') as [col7], c5_2.value('@value', 'varchar(20)') as [col8], from @x.nodes('Level1/Level2') as t(c) cross apply c.nodes('Level3') as t2(c2) cross apply c2.nodes('Level4') as t3(c3) cross apply c3.nodes('Level5') as t4(c4) cross apply c4.nodes('Level6_1') as t5_1(c5_1) cross apply c4.nodes('Level6_2') as t5_2(c5_2)
The file I am reading is 13MB and it takes 3.5 hours to read it in.
Advertisement
Answer
Without being able to test on so many rows as you have, I’ll give you this script to try out (fiddle<>):
DECLARE @x XML=N' <Level1><Level2 name="l2_name"><Level3 name="l3_name"> <Level4 name="l4_name"><Level5 name="l5_name"> <Level6 name="l6_name" value="l6_value"/> <Level6 name="l6_name" value="l6_value"/> </Level5></Level4></Level3></Level2></Level1>'; SELECT x.n.value('../../../../@name', 'varchar(20)') as [col1], x.n.value('../../../@name', 'varchar(20)') as [col2], x.n.value('../../@name', 'varchar(20)') as [col3], x.n.value('../@name', 'varchar(20)') as [col4], x.n.value('./@name', 'varchar(20)') as [col5] FROM @x.nodes('/Level1/Level2/Level3/Level4/Level5/Level6') AS x(n);
This would explicitly query for Level6
nodes, then backtrack for the parents’ attribute values. This is quite likely faster than cross applying queries for each individual LevelN
element.
Updated, for elements in level 6 with different names and assuming only one such element of each name appears as a child of level 5:
DECLARE @x_2 XML=N'<Level1> <Level2 name="l2_name"> <Level3 name="l3_name"> <Level4 name="l4_name"> <Level5 name="l5_name"> <Level6_1 name="l6_1_name" value="l6_1_value"/> <Level6_2 name="l6_2_name" value="l6_2_value"/> </Level5></Level4></Level3></Level2></Level1>'; SELECT x.n.value('../../../@name', 'varchar(20)') as [col1], x.n.value('../../@name', 'varchar(20)') as [col2], x.n.value('../@name', 'varchar(20)') as [col3], x.n.value('./@name', 'varchar(20)') as [col4], x.n.value('(./Level6_1/@name)[1]', 'varchar(20)') as [col5], x.n.value('(./Level6_2/@name)[1]', 'varchar(20)') as [col6] FROM @x_2.nodes('/Level1/Level2/Level3/Level4/Level5') AS x(n);
This selects nodes on level 5, backtracks to parents for parent attributes, then selects child elements based on name. Selects the first such element using [1]
selector.