If you wish to maintain the original dataset values, you can apply an aggregate function within a single column.
Values to Columns
Similar to pivot, the Convert values to columns transformation converts individual values within a column to independent columns in the dataset. For each row, if the value represented by the column is present in the original data, one value is added (e.g. Yes). If it's missing, another value is inserted instead.
Tip: This type of conversion can be useful for preparing data for machine learning systems. You can convert the presence or absence of specific values in a row to
In the following, the values in the Store_Nbr column have been converted to individual columns:
In the above:
- Fill when present identifies the string literal value to insert if the row contains the column's value (
- Fill when missing identifies the string literal value to insert if the row does not contain the column's value (empty).
- Max number of columns to create places a limit on the total number of columns that the application is permitted to create. In this case, the limit is set to 250, since the known number of stores is 250.
Tip: It's a good habit to set limits on the maximum number of columns to create. Data can become sparse or unwieldy if limits are not considered.
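The behavior described above can be approximated in plain Python. This is a minimal sketch, not the application's implementation: the function name `values_to_columns`, its parameter names, the `Store_Nbr_<value>` naming of the new columns, the `Sales` column in the sample data, and the choice to raise an error when the column limit is exceeded are all assumptions made for illustration.

```python
def values_to_columns(rows, column, fill_present="Yes", fill_missing="",
                      max_columns=250):
    """Add one new column per distinct value found in `column`.

    Hypothetical helper: the names and the error-on-limit behavior are
    assumptions, not the application's documented behavior.
    """
    # Collect distinct values in first-seen order.
    distinct = list(dict.fromkeys(row[column] for row in rows))

    # Enforce the cap on newly created columns (assumed here to be an error).
    if len(distinct) > max_columns:
        raise ValueError(
            f"{len(distinct)} distinct values exceed the limit of {max_columns}"
        )

    result = []
    for row in rows:
        new_row = dict(row)  # keep the original values untouched
        for value in distinct:
            # "Fill when present" if this row holds the value,
            # otherwise "Fill when missing".
            new_row[f"{column}_{value}"] = (
                fill_present if row[column] == value else fill_missing
            )
        result.append(new_row)
    return result


# Sample data is invented for the example.
rows = [
    {"Store_Nbr": 1, "Sales": 100},
    {"Store_Nbr": 2, "Sales": 250},
    {"Store_Nbr": 1, "Sales": 75},
]
converted = values_to_columns(rows, "Store_Nbr")
```

Each row in `converted` keeps its original fields and gains one column per store number, filled with `Yes` where the row belongs to that store and an empty string otherwise. Lowering `max_columns` below the number of distinct stores makes the sketch fail early instead of producing a very wide, sparse result.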