Skip to content
Advertisement

BigQuery query nested json

I have JSON data which is saved in BigQuery as a string.

{
   "event":{
      "action":"prohibitedSoftwareCheckResult",
      "clientTime":"2017-07-16T12:55:40.828Z",
      "clientTimeZone":"3",
      "serverTime":"2017-07-16T12:55:39.000Z",
      "processList":{
         "1":"outlook.exe",
         "2":"notepad.exe"
      }
   },
   "user":{
      "id":123456,

   }
}

I want to have a result set where each process will be in a different row. Something like:

UserID        ProcessName
-------------------------
123456        outlook.exe
123456        notepad.exe

I saw there is an option to query repeated data but the field needs to be RECORD type to my understanding.

Is it possible to convert to RECORD type “on the fly” in a subquery? (I can’t change the source field to RECORD).

Or, is there a different way to return the desired result set?

Advertisement

Answer

This could be a possible work around for you:

SELECT
  user_id,
  processListValues
FROM(
  SELECT 
    JSON_EXTRACT_SCALAR(json_data, '$.user.id') user_id,
    REGEXP_EXTRACT_ALL(JSON_EXTRACT(json_data, '$.event.processList'), r':"([a-zA-Z0-9.]+)"') processListValues
  FROM data
),
UNNEST(processListValues) processListValues

Using your JSON as example:

WITH data AS(
SELECT """{
   "event":{
      "action":"prohibitedSoftwareCheckResult",
      "clientTime":"2017-07-16T12:55:40.828Z",
      "clientTimeZone":"3",
      "serverTime":"2017-07-16T12:55:39.000Z",
      "processList":{
         "1":"outlook.exe",
         "2":"notepad.exe",
         "3":"outlo3245345okexe"
      }
   },
   "user":{
      "id":123456,

   }
}""" as json_data
)

SELECT
  user_id,
  processListValues
FROM(
  SELECT 
    JSON_EXTRACT_SCALAR(json_data, '$.user.id') user_id,
    REGEXP_EXTRACT_ALL(JSON_EXTRACT(json_data, '$.event.processList'), r':"([a-zA-Z0-9.]+)"') processListValues
  FROM data
),
UNNEST(processListValues) processListValues

Results:

Row user_id processListValues    
1   123456  outlook.exe  
2   123456  notepad.exe  
3   123456  outlo3245345okexe    
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement