In the Transformer page:
- You can create new columns, modify them, and delete them to re-scope the size of your data to the most meaningful information.
- You can reshape your data through pivots and aggregations.
- Nested data in the form of Arrays or Objects (key-value pairs) can be un-nested across columns and rows for easier manipulation. As needed, patterned data can be re-nested through transformations that are easy to select and manipulate.
Tip: When reshaping your data from its original form, you may find it useful to build your pivots and aggregations in separate recipes created from your current recipe. That way, you preserve the original structure and can still explore more significant transformations as needed.
Transformations to Reshape Your Dataset
Some recipe steps change the number of rows in the dataset, which affects both the dataset and any samples generated from it.
These reshaping steps include the following transformations:
|Transformation|For more information|
|Splitrows|Initial Parsing Steps|
|Expand Arrays into Rows|Working with Arrays|
|Filter Rows (keep or delete)|Remove Data|
|Pivot Table|Pivot Data|
|Unpivot Columns|Unpivot Columns|
|Join Datasets|Join Window|
|Union Datasets|Union Page|
|Select Lookup from the column menu|Lookup Wizard|
|Remove Duplicate Rows|Remove Data|
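To make the effect on row counts concrete, the following is a minimal sketch in plain Python, not the product's recipe language. The data and the column names `id`, `q1`, and `q2` are hypothetical. It shows how three of the listed transformations (removing duplicates, filtering, and unpivoting) each change the number of rows.

```python
# Hypothetical three-row dataset; the last row is an exact duplicate.
rows = [
    {"id": 1, "q1": 10, "q2": 20},
    {"id": 2, "q1": 30, "q2": 40},
    {"id": 2, "q1": 30, "q2": 40},
]

# Remove duplicate rows: keep the first occurrence of each identical row.
seen = set()
deduped = []
for row in rows:
    key = tuple(row.items())
    if key not in seen:
        seen.add(key)
        deduped.append(row)

# Filter rows (keep): retain only rows that match a condition.
kept = [r for r in deduped if r["q1"] >= 30]

# Unpivot columns: each (row, column) pair becomes its own row.
unpivoted = [
    {"id": r["id"], "key": col, "value": r[col]}
    for r in deduped
    for col in ("q1", "q2")
]

print(len(rows), len(deduped), len(kept), len(unpivoted))
```

Deduplication and filtering reduce the row count, while unpivoting increases it. In each case the dataset no longer lines up row-for-row with its earlier form.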
Samples and reshaping your datasets
When one of these transformations is applied and rows are removed from your dataset:
- Any samples generated before the step was added are invalidated and cannot be used.
- If you edit steps in your recipe before this added transformation, any samples that you generated after the step are invalidated and cannot be used.
- A valid initial sample is always available for use.
For more information, see Samples Panel.
Build Pivot Tables
You can reshape your data by building pivot tables. Pivot tables are useful when you want to calculate aggregation functions, such as sums, maximums, and averages for one or more columns of data.
For example, you can reshape the data to include the sum of POS_Sales for each distinct value in the Daily column, broken out across the values of another column.
For more information, see Pivot Data.
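The pivot described above can be sketched in plain Python. This is only an illustration of the idea, not how the product computes it. `POS_Sales` and `Daily` come from the example; the second grouping column `Store`, the sample values, and the helper `pivot_sum` are assumptions made for this sketch.

```python
from collections import defaultdict


def pivot_sum(rows, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) pair.

    Returns one entry per distinct row_key value; each entry maps the
    distinct col_key values to the summed value_key.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        totals[row[row_key]][row[col_key]] += row[value_key]
    return {r: dict(cols) for r, cols in totals.items()}


# Hypothetical data; "Store" is an assumed column name.
sales = [
    {"Daily": "Mon", "Store": "A", "POS_Sales": 10.0},
    {"Daily": "Mon", "Store": "B", "POS_Sales": 5.0},
    {"Daily": "Mon", "Store": "A", "POS_Sales": 2.5},
    {"Daily": "Tue", "Store": "A", "POS_Sales": 7.0},
]

pivoted = pivot_sum(sales, "Daily", "Store", "POS_Sales")
print(pivoted)
```

Each distinct `Daily` value becomes one output row, and each distinct `Store` value becomes one output column holding the summed sales.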
An aggregation is a computation across a grouped set of rows. Dataprep by Trifacta provides a wide range of aggregation functions that you can apply:
- To an entire column (called a flat aggregation)
- To generate a new column
- To reshape your entire table
For more information, see Create Aggregations.
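The three uses listed above differ mainly in the shape of the output. A minimal plain-Python sketch follows, using hypothetical data and column names (`region`, `sales`); it is not the product's own syntax.

```python
from collections import defaultdict

# Hypothetical dataset.
rows = [
    {"region": "east", "sales": 10},
    {"region": "east", "sales": 30},
    {"region": "west", "sales": 20},
]

# 1. Flat aggregation: a single value computed over the entire column.
total_sales = sum(r["sales"] for r in rows)

# Group totals, reused by the next two cases.
group_totals = defaultdict(int)
for r in rows:
    group_totals[r["region"]] += r["sales"]

# 2. New column: every row is kept and gains its group's aggregate.
with_totals = [{**r, "region_sales": group_totals[r["region"]]} for r in rows]

# 3. Reshape: the table collapses to one row per group.
reshaped = [{"region": k, "sum_sales": v} for k, v in group_totals.items()]

print(total_sales, len(with_totals), len(reshaped))
```

Only the third form changes the number of rows, which is why it belongs with the reshaping steps described earlier.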
Nest and Unnest
You can combine data in separate columns into single-column values stored in Arrays or Objects (maps). Similarly, data from an Array or Object column can be converted into new rows or columns based on the keys in the source data.
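As an illustration of nesting and unnesting, here is a small plain-Python sketch in which a row is a dict, an Object is a nested dict, and an Array is a list. The helper names (`nest_to_object`, `unnest_object`, `expand_array_to_rows`) and the sample columns are hypothetical and do not correspond to product functions.

```python
def nest_to_object(row, columns, target):
    """Combine the named columns into a single Object (dict) column."""
    nested = {c: row[c] for c in columns}
    rest = {k: v for k, v in row.items() if k not in columns}
    return {**rest, target: nested}


def unnest_object(row, source):
    """Promote each key of an Object column to its own column."""
    rest = {k: v for k, v in row.items() if k != source}
    return {**rest, **row[source]}


def expand_array_to_rows(row, source):
    """Emit one row per element of an Array column."""
    return [{**row, source: item} for item in row[source]]


# Hypothetical rows.
flat = {"id": 1, "city": "Oslo", "zip": "0150"}
nested = nest_to_object(flat, ["city", "zip"], "address")
restored = unnest_object(nested, "address")
expanded = expand_array_to_rows({"id": 1, "tags": ["a", "b"]}, "tags")
print(nested, restored, expanded)
```

Unnesting an Object adds columns, while expanding an Array adds rows.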
You can select a set of columns to replace the current dataset completely. See Select.
You can reshape your data by deleting unwanted columns in the dataset. You can delete a single column or multiple columns.
- To delete a column from your dataset, click the required column and select Delete from the column drop-down.
- If you select Delete others, all columns except the selected column are deleted.
Tip: To delete multiple columns, select them in the data grid or column browser. Then select Delete from the column menu.
These menu choices are turned into recipe steps that use the Delete columns transformation.
Tip: When using the Delete columns transformation, you can use the tilde (~) character between the start and end column names to delete a range of columns.
See Delete Data.
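The range behavior of the tilde can be sketched in plain Python. This sketch assumes the range follows the dataset's column order and includes both endpoints. The helpers `expand_column_spec` and `delete_columns` are hypothetical names introduced for illustration, not product functions.

```python
def expand_column_spec(spec, all_columns):
    """Expand 'start~end' into every column from start to end.

    Assumes the range is inclusive and follows the dataset's column order.
    A spec without a tilde names a single column.
    """
    if "~" not in spec:
        return [spec]
    start, end = (name.strip() for name in spec.split("~", 1))
    i, j = all_columns.index(start), all_columns.index(end)
    if i > j:
        raise ValueError(f"{start!r} comes after {end!r} in the dataset")
    return all_columns[i:j + 1]


def delete_columns(rows, specs, all_columns):
    """Return copies of rows without the columns named by specs."""
    doomed = {
        name
        for spec in specs
        for name in expand_column_spec(spec, all_columns)
    }
    return [{k: v for k, v in row.items() if k not in doomed} for row in rows]


# Hypothetical dataset with five columns.
columns = ["a", "b", "c", "d", "e"]
data = [{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}]
trimmed = delete_columns(data, ["b~d"], columns)
print(trimmed)
```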
You can split a column based on one or more known delimiters or based on index positions in the data. See Split Column.
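The two splitting modes can be sketched in plain Python using the standard `re` module. The helper names `split_on_delimiters` and `split_at_positions` and the sample values are hypothetical; this illustrates the idea rather than the product's implementation.

```python
import re


def split_on_delimiters(value, delimiters):
    """Split a string wherever any of the literal delimiters occurs."""
    pattern = "|".join(re.escape(d) for d in delimiters)
    return re.split(pattern, value)


def split_at_positions(value, positions):
    """Split a string at the given character index positions."""
    bounds = [0, *positions, len(value)]
    return [value[a:b] for a, b in zip(bounds, bounds[1:])]


by_delimiter = split_on_delimiters("a,b;c", [",", ";"])
by_position = split_at_positions("20240131", [4, 6])
print(by_delimiter, by_position)
```

Delimiter-based splitting suits values with visible separators. Position-based splitting suits fixed-width values such as compact dates.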