Through simple selections, you can identify columns to remove, values on which to base row deletion, or strings to remove from your dataset. As needed, these transformations can be modified for more sophisticated removal transformations.
Please keep in mind:
To delete a column from your dataset, click the column drop-down and select Delete. The data is no longer available in the data grid or subsequent recipe steps.
Tip: To delete multiple columns, select them in the data grid or column browser. Then select Delete from the column menu.
Tip: To simply remove columns from display, use the Hide command. The hidden column still appears in the output.
To delete multiple columns, you can specify comma-separated column names in your Delete Columns transformation:
To delete a range of columns, use the tilde (
~) character between the start and end column names:
For more information, see Remove Data.
You can delete rows in your dataset based on conditional patterns that you specify. The easiest method is to select a string in the appropriate column and then choose the Delete suggestion card.
In the following example, each row contains an entry for a different business, and you want to remove all of the business entries from the city of Tempe.
citycolumn, or you can use the Filters panel to filter for rows containing the value
Then, select the Delete suggestion card.
Select Tempe in the City column to remove all entries for that city
Delete, the application evaluates your selected value and attempt your intention with the selection. Is it a string literal or a pattern? If it's a pattern, what does the pattern represent? You may select one of the variants in the Delete card to find the right match.
NOTE: Be sure to scroll up and down in the data grid to review the values that are affected. In some cases, your selection may turn into a pattern, which could apply to more than just the desired values. In the previous example, selecting
Tip: You can also use the Filter Rows to retain rows based on a specified condition, effectively deleting the rows that do not match. See Filter Data.
You can delete or keep rows in your dataset based on one or more matching conditions you define.
true, then the data is filtered.
For more information, see Filter Data.
You can delete or keep rows based on whether a cell value in the row matches a specified data type. The following example removes rows that do not match the
mm*dd*yy format for the Datetime data type from the
If you have rows in your dataset that contain no data, you can use the following two steps to remove them. Assuming that you know the starting (
col1) and ending (
colN) column names of your dataset, try the following:
NOTE: If at a later time, you reorder or remove the starting or ending columns in a step before this one, these steps are broken.
The above merges all values into a single value in the
all_blank_vals column. The second step removes the row if the value in the merged column is blank.
Remember to delete the
all_blank_vals column after you are done.
For more information, see Filter Data.
To delete values from a column, select the values in the data grid. In the suggestion cards, select the
Replace card. In the following example, the
city column is removed of all values matching
The Replace transformation applies only to string values. The rest of a matching row is unaffected.
The above transformation matches all values in the column, even partial values, the match string is removed from the column value. For example, an entry
Tempest would be turned into
st if the above transformation was added.
To ensure that only full-column value matches are applied, you can add to indicate the start and end of the column value as in the following:
In the above case, only values of
Tempe that are the entire column value are matched. For more information on this pattern-based matching, see Text Matching.
For more sophisticated matching, you can apply regular expressions to your
replace command. In the following example, all integers from 0-99 are matched in the
qty column. Because there is no replacement value, they are deleted.
Regular expressions are very powerful pattern matching tools. You should be careful in your use of them. See Text Matching.
|Beginning of string. Required to prevent matching on the last digit of any numeric value.|
|End of string. Required to prevent a 2-digit match on three-digit numbers.|
|\||A single digit|
|Logical or. In this case, it is used to define separate regexes for 1- and 2-digit values.|