Unpacks nested data from an Array or Object column to create new rows or columns based on the keys in the source data.   

This transform works differently on columns of Array or Object type.

The  unnest  transform must include keys that you specify as part of the transform step. To unnest a column of array data that contains no keys, use the flatten transform. See Flatten Transform.

This transform might be automatically applied as one of the first steps of your recipe. See Initial Parsing Steps.

 

unnest col: myObj keys:'sourceA','sourceB' pluck:true markLineage:true

Output:

unnest col:column_ref keys:'key1','key2' [pluck:true|false] [markLineage:true|false]

TokenRequired?Data TypeDescription
unnestYtransformName of the transform
colYstringSource column name
keysYstringComma-separated list of quoted key names. See below for examples.
pluckNbooleanIf true, any values unnested from the source are also removed from the source. Default is false.
markLineageNbooleanIf true, the names of new columns are prepended with the name of the source column.

col

Identifies the column to which to apply the transform. You can specify only one column.

Required?Data Type
YesString (column name)

keys

NOTE: Keys that contain non-alphanumeric values, such as spaces, must be enclosed in square brackets and quotes. Values with underscores do not require this bracketing.

The comma-separated list of keys determines the columns to generate from the source data. If you specify three values for keys, the three new columns contain the corresponding values from the source column.

This parameter has different syntax to use for single-level and multi-level nested data. There are also variations in syntax between Object and Array data type.

Required?Data Type
Yes

Comma-separated String values.

Syntax examples are provided below.

Keys for Object data - single-level

NOTE: Key names are case-sensitive.

For a single, top-level key in an Object field, you can specify the key as a simple quoted string:

unnest col:myCol keys: 'myObjKey'

The above looks for the key myObjKey among the top-level keys in the Object and returns the corresponding value for the new column. You can also bracket this key in square brackets:

unnest col:myCol keys: '[myObjKey]'

To specify multiple first-level keys, use the following:

unnest col:myCol keys:'myObjKey','my2ndObjKey'

The above generates two new columns ( myObjKey and my2ndObjKey) containing the corresponding values for the keys.

Keys for Object data - multi-level

You can also reference keys that are below the first level in the Object. 

Example data:

{ "Key1" :
  { "Key1A" :
    { "Key1A1" : "Value1" }
  }
}
{ "Key2" :
  { "Key2A" :
    { "Key2A1" : "Value2" }
  }
}
{ "Key3" :
  { "Key3A" :
    { "Key3A1" : "Value3" }
  }
}

To acquire the data for the Key1A key, use the following:

unnest col: myCol keys: 'Key1[Key1A]'

In the new column, the displayed value is the following:

{ "Key1A1" : "Value1" }

To unnest a third-layer value, use a transform similar to the following:

unnest col: myCol keys: 'Key2[Key2A][Key2A1]'

In the new column, this transform generates a value of Value2.

Keys for Array data - single level

You can reference array elements using zero-based indexes or key names.

NOTE: All references to Array keys must be bracketed. Array keys can be referenced by index number only.

Example array data:

["red","orange","yellow","green","blue","indigo","violet"]

unnest col: myCol keys:'[1]'

The above transform retrieves the value orange from the array.  

unnest col: myCol keys:'[1]','[3]'

Returned values: orange and green.

Keys for Array data - multi-level

The following example nested Array data matches the structure of the Object data in the previous example:

[ [ "Item1", ["Item1A", ["Item1A1","Value1"] ] ], [ "Item2", ["Item2A",  ["Item2A1","Value2"] ] ], [ "Item3", ["Item3A",["Item3A1","Value3"] ] ] ] 

To unnest the value for Items2A:

unnest col:myCol keys:'[1][0]'

The value inserted into the new column is the following:

["Item2A1","Value2"]

To unnest from the third level:

unnest col:myCol keys:'[2][0][0]'

The inserted value is Item3A.

pluck

Required?Data Type
NoBoolean

markLineage

NOTE: If your unnest transform does not change the number of rows, you can still access source row number information in the data grid, assuming it was still available when the transform was executed.

Required?Data Type
NoBoolean


Example - Unnest an Object

You have the following dataset. The Sizes column contains Object data on available sizes. 

Source:

ProdIdProdNameSizes
1001Hat{'Small':'N','Medium':'Y','Large':'Y','Extra-Large':'Y'}
1002Shirt{'Small':'N','Medium':'Y','Large':'Y','Extra-Large':'N'}
1003Pants{'Small':'Y','Medium':'Y','Large':'Y','Extra-Large':'N'}

Transformation:

NOTE: Depending on the format of your source data, you might need to perform some replacements in the Sizes column in order to make it inferred as proper Object type values. The final format should look like the above.

If it is not inferred already, set the type of the Sizes column to Object:

Unnest the data into separate columns. The following prepends Sizes_ to the newly generated column name.

You might find it useful to add pluck:true to the above transform. When added, values that are un-nested are removed from the source, leaving only the values that weren't processed:


If all values have been processed, the  Sizes column now contains a set of maps missing data. You can use the following to determine if the length of the remaining data is longer than two characters. This transform is a good one to just preview:

You can delete the source column:

Results:

When you are finished, the dataset should look like the following:

ProdIdProdNameSizes_SmallSizes_MediumSizes_LargeSizes_Extra-Large
1001HatNYYY
1002ShirtNYYN
1003PantsYYYN

Example - Unnest an array

The following example demonstrates differences between the unnest and the flatten transform, including how you use unnest to flatten array data based on specified keys.

 

Example - extracting key values from car data and then unnesting into separate columns