Transform a Row
You can transform rows in your dataset based on the conditions you specify or by using a custom formula.
Deduplicate Rows
You can use the Remove duplicate rows
transformation to remove identical rows from the dataset.
Steps:
In the Transformer page, click the Recipe icon. The Recipe Panel is displayed.
In the Search Transformations panel, enter
Remove duplicate rows
. The duplicated rows are highlighted in the Transformer page.To remove the duplicate rows, click Add.
Limitations:
This transformation is case-sensitive. So, if a column has values
Hash
andHASH
, the rows containing those values are not considered duplicates and cannot be removed with this transformation.Whitespace and the beginning and ending of values is not ignored.
Delete Rows on a Condition
By selection
You can delete the rows for any column based on a condition that you select. When you select values in the data quality bar that for a column, a set of predictive suggestions is displayed in the right panel. Locate the Delete rowstransformation, which contains a where
clause to indicate the condition on which a row is deleted.
For example, if your data contains mismatched values, select the red bar in the data quality bar and use the Delete rows transformation to delete the mismatched values from the dataset:
By custom condition
You can also delete a set of rows based on a condition expression that you specify. If the conditional expression evaluates to true
, then the row is deleted.
In the following example, you can delete a set of POS_Sales
values based on a conditional parameter. All POS_Sales
where the value is less than or equal to 5 are deleted, and the rest are retained.
Store_Nbr | Item_Nbr | POS_Sales | POS_Qty | POS_Cost |
---|---|---|---|---|
200 | 322000 | 22.00 | 44 | 23.76 |
200 | 323000 | 1.75 | 1 | 2.43 |
200 | 325000 | 0.00 | 0 | 0.00 |
200 | 325000 | 9.44 | 8 | 7.44 |
200 | 326000 | 2.00 | 2 | 1.64 |
200 | 327000 | 5.00 | 10 | 5.40 |
200 | 328000 | 7.00 | 7 | 6.51 |
200 | 341000 | 0.00 | 0 | 0.00 |
200 | 341000 | 7.78 | 21 | 19.53 |
200 | 342000 | 0.00 | 0 | 0.00 |
200 | 350000 | 5.00 | 10 | 5.40 |
Steps:
In the Transformer page, click the Recipe icon. The Recipe panel is displayed.
In the Search Transformations panel, enter
Filter in
.In the Filter rows transformation, enter the required details:
Condition: Filter based on the condition type that you select in the drop-down. Some condition types do not support specifying the condition by formula.
Column: The column containing the values to filter. For example,
POS_Sales
.Values or Formula: Specify the values or the formula used to determine the condition.
If these values are present, then the condition evaluates to
true
.The formula must evaluate to
true
orfalse
.
Action: The action to be performed to the rows based on the specified conditions. In this case, select
Delete matching rows
.Transformation Name
Filter rows
Parameter: Condition
Less than or equal to
Parameter: Column
POS_Sales
Parameter: Value
POS_Sales <= 5
Parameter: Action
Delete matching rows
To add the recipe to the step, click Add. The selected rows are deleted from the dataset.
Results:
Store_Nbr | Item_Nbr | POS_Sales | POS_Qty | POS_Cost |
---|---|---|---|---|
200 | 322000 | 22.00 | 44 | 23.76 |
200 | 325000 | 9.44 | 8 | 7.44 |
200 | 328000 | 7.00 | 7 | 6.51 |
200 | 341000 | 7.78 | 21 | 19.53 |
Filter the Data Grid
You can filter the sampled data to display only a subset of rows in the Transformer page. You can filter the dataset by rows, columns, and data types.
Columns or rows that are filtered are hidden only from the view. The hidden data is part of the sample and the dataset and is included in the output.
Steps:
In the Transformer page, click the Filter icon.
In the Filter by name panel, enter the required options:
Columns: Search for individual columns or filter columns of a specific type. Filtered columns are displayed, and the rest are hidden.
Rows: Highlight search term matches found in any column for a row.
Note
You can also filter the rows and columns by data type.
To clear all filters, click Clear all filters.
Hide and show columns
In the Transformer page, click the Eye icon in the status bar at the bottom of the page. Click one or more columns to show or hide them.
Astuce
You can also use the Columns menu in the Transformer page to show or hide columns.
Note
When a column is hidden from a dataset, it is hidden for all users. You should check the column browser for hidden columns in shared datasets.
Filter during previews
When you are creating or editing transformations, the data grid displays a preview of results of the transformation in development. As needed, you can narrow the display to show only the affected rows, columns, or both. Select the Show only affected rows checkbox in the status bar at the bottom of the Transformer page.