Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space DEV and version next

...

ProdIdProdNameSizes
1001Hat{'Small':'N','Medium':'Y','Large':'Y','Extra-Large':'Y'}
1002Shirt{'Small':'N','Medium':'Y','Large':'Y','Extra-Large':'N'}
1003Pants{'Small':'Y','Medium':'Y','Large':'Y','Extra-Large':'N'}

TransformTransformation:

Info

NOTE: Depending on the format of your source data, you might need to perform some replacements in the Sizes column in order to make it inferred as proper Object type values. The final format should look like the above.

If it is not inferred already, set the type of the Sizes column to Object:

d-

...

trans
RawWrangletrue
Typestep
WrangleTextsettype col: Sizes type: 'Object'
p01NameColumns
p01ValueSizes
p02NameNew type
p02ValueObject
SearchTermChange column data type

Unnest the data into separate columns. The following prepends Sizes_ to the newly generated column name.

d-

...

trans
RawWrangletrue
p03Valuetest
Typestep
WrangleTextunnest col:Sizes keys:'Small','Medium','Large','Extra-Large' markLineage:true
p01NameColumn
p01ValueSizes
p02NamePaths to elements
p02Value'Small','Medium','Large','Extra-Large'
p03NameInclude original column name
SearchTermUnnest Objects into columns

You might find it useful to add pluck:true to the above transform. When added, values that are un-nested are removed from the source, leaving only the values that weren't processed:

d-

...

trans
RawWrangletrue
p03Valuetrue
Typestep
WrangleTextunnest col:Sizes keys:'Small','Medium','Large','Extra-Large' markLineage:true pluck:true
p01NameColumn
p01ValueSizes
p02NamePaths to elements
p02Value'Small','Medium','Large','Extra-Large'
p03NameRemove elements from original
p04Valuetrue
p04NameInclude original column name
SearchTermUnnest Objects into columns

If all values have been processed, the  Sizes column now contains a set of maps missing data. You can use the following to determine if the length of the remaining data is longer than two characters. This transform is a good one to just preview:

d-

...

trans
RawWrangletrue
p03Value'len_Sizes'
Typestep
WrangleTextderive type:single value:(

...

len(Sizes) > 2) as:'len_Sizes'
p01NameFormula type
p01ValueSingle row formula
p02NameFormula
p02Value(len(Sizes) > 2)
p03NameNew column name
SearchTermNew formula

You can delete the source column:

d-

...

trans
RawWrangletrue
Typestep
WrangleTextdrop col:Sizes
p01NameColumns
p01ValueSizes
p02NameAction
p02ValueDelete selected columns
SearchTermDelete columns

Results:

When you are finished, the dataset should look like the following:

...