
Flatten data source in Snowflake from Array

I am trying to fix an array in a dataset. Currently, I have a dataset in which each reference number maps to an array of multiple UUIDs. I would like to flatten this out in Snowflake so that the reference number has a separate row for each UUID. For example

Should end up looking like:

I just started working in Snowflake, so I am new to it. It looks like there is a LATERAL FLATTEN, but it is either not working or giving me all sorts of errors. The Snowflake documentation is a bit perplexing on this topic.


Answer

FLATTEN is the right approach for exploding an array, but the UUID column value shown in the original description is not valid JSON: "[""val1"", ""val2""]". It needs to be corrected and parsed into a VARIANT before LATERAL FLATTEN can be applied to it.

If the data sample in the original description is literal and all values in the column follow the same pattern, then the following query will help transform it into valid JSON syntax and then apply a lateral flatten to yield the desired result:
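
A minimal sketch of such a query, assuming a hypothetical table `my_table` with columns `reference_number` and `uuid` (these names, and the sample row in the CTE, are illustrative and not from the original post). It strips the outer double quotes, collapses each doubled quote into a single one, parses the result as JSON, and flattens the array:

```sql
-- Hypothetical sample data; replace the CTE with your real table.
WITH my_table (reference_number, uuid) AS (
    SELECT 'ref1', '"[""val1"", ""val2""]"'
),
cleaned AS (
    SELECT
        reference_number,
        -- "[""val1"", ""val2""]"  ->  ["val1", "val2"]
        -- TRIM removes the leading/trailing double quotes,
        -- REPLACE turns each doubled quote ("") into a single quote (").
        PARSE_JSON(REPLACE(TRIM(uuid, '"'), '""', '"')) AS uuid_arr
    FROM my_table
)
SELECT
    c.reference_number,
    f.value::STRING AS uuid   -- cast the VARIANT element to a plain string
FROM cleaned c,
     LATERAL FLATTEN(INPUT => c.uuid_arr) f;
```

This should produce one row per array element for each reference number. If some rows may contain malformed values, `TRY_PARSE_JSON` can be used in place of `PARSE_JSON` so that those rows yield NULL instead of failing the query.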

If your data is already a valid VARIANT, with PARSE_JSON successfully applied to the UUID column during ingest, and the example in the description only looks like invalid JSON because of how it was formatted in the post, then a simpler version of the same query will suffice:
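
A sketch of that simpler form, again assuming a hypothetical table `my_table` whose `uuid` column already holds a VARIANT array (the table name, column names, and the sample row are assumptions for illustration):

```sql
-- Hypothetical sample data with uuid already stored as a VARIANT array.
WITH my_table (reference_number, uuid) AS (
    SELECT 'ref1', PARSE_JSON('["val1", "val2"]')
)
SELECT
    t.reference_number,
    f.value::STRING AS uuid   -- one output row per array element
FROM my_table t,
     LATERAL FLATTEN(INPUT => t.uuid) f;
```

No cleanup step is needed here because FLATTEN can read the VARIANT array directly.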

User contributions licensed under: CC BY-SA