Skip to content
Advertisement

In Hive, how to read through NULL / empty tags present within an XML using explode(XPATH(..)) function?

In below Hive-query, I need to read the null / empty “string” tags as well, from the XML content. Only the non-null “string” tags are getting considered within the XPATH() list now.

Expected output from query:

Advertisement

Answer

The problem here is that XPATH returns NodeList and if it contains empty node, it is not included in the list.

Concatenation with some string (in XPATH): concat(/Value/string/text()," ") does not work here:

Caused by: javax.xml.xpath.XPathExpressionException: com.sun.org.apache.xpath.internal.XPathException: Can not convert #STRING to a NodeList!

at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.evaluate(XPathExpressionImpl.java:195)

Easy solution is to replace <string></string> and <string/> with <string>NULL</string> and then you can convert ‘NULL’ string to null.

Demo:

Result:

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement