Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space DEV and version r0811
Excerpt

This example shows how you can

...

break out a column of nested values into separate rows and columns of data.

Source:

The following data covers magazine subscriptions for individual customers. Their subscriptions are stored in an array of values. You are interested in who is subscribing to each magazine. 

CustIdSubscriptions
Anne Aimes["Little House and Garden","Sporty Pants","Life on the Range"]
Barry Barnes["Sporty Pants","Investing Smart"]
Cindy Compton["Cakes n Pies","Powerlifting Plus","Running Days"]
Darryl Diaz["Investing Smart","Cakes n Pies"]

TransformTransformation:

When this data is loaded into the Transformer, you might need to apply a header to it. If it is in CSV format, you might need to apply some replace transforms transformations to clean up the Subscriptions column so it looks like the above. 

When the Subscriptions column contains cleanly formatted arrays, the column is re-typed as Array type. You can then apply the flatten transformthe following transformation:

 

d-

...

trans
RawWrangletrue
Typestep
WrangleTextflatten col:Subscriptions
p01NameColumn
p01ValueSubscriptions
SearchTermExpand Array into rows

Each CustId/Subscription combination is now written to a separate row. You can use this new data structure to break out instances of magazine subscriptions. Using the following transformtransformation, you can add the corresponding CustId value to the column:

d-

...

trans
RawWrangletrue
Typestep
WrangleTextvaluestocols col:Subscriptions value:CustId

...

Drop

p01NameColumn
p01ValueSubscriptions
p02NameFill when present
p02ValueCustId
SearchTermConvert values to columns

Delete the two source columns:

d-

...

trans
RawWrangletrue
Typestep
WrangleTextdrop col:CustId,Subscriptions
p01NameColumns
p01ValueCustId,Subscriptions
p02NameAction
p02ValueDelete selected columns
SearchTermDelete columns

Results:

Little_House_and_GardenSporty_PantsLife_on_the_RangeInvesting_SmartCakes_n_PiesPowerlifting_PlusRunning_Days
Anne Aimes      
 Anne Aimes     
  Anne Aimes    
 Barry Barnes     
   Barry Barnes   
    Cindy Compton  
     Cindy Compton 
      Cindy Compton
   Darryl Diaz   
    Darry Diaz  

 

D s also
labelexample_flatten_and_valuestocols_transforms