Replaces values within the specified column or columns based on the string literal, pattern, or location within the cell value, as specified in the transform. |
The replace
transform is used primarily to match on patterns within a string. For entire cell replacement across all rows of the column, use the set
transform. See Set Transform.
on
parameter example:
Specifies the string literal or pattern to match.
replace col: text on: 'honda' with:'toyota' global: true |
Output: Replaces all instances in the text
column of honda
with toyota
. If honda
appears twice a cell, both instances are replaced with toyota
.
at
parameter example:
Specifies the beginning character and ending character as index values for the match.
replace col: text at: 2,6 with:'replacement text' |
Output: For all values in the text
column, replace the substring between character 2 and 6 in the column with the value replacement text
. If the length of the original cell value is less than 6, the replacement value is inserted.
replace col:column_ref with:'literal_replacement' [at:(start_index,end_index)] [on:string_literal_pattern] [global:true|false] |
Token | Required? | Data Type | Description |
---|---|---|---|
replace | Y | transform | Name of the transform |
col | Y | string | Name of column where to make replacements |
with | Y | see below | Literal value with which to replace matched values |
after | N | string | String literal or pattern that precedes the pattern to match |
at | N | Array | Two-integer array identifying the character indexes of start and end characters to match |
before | N | string | String literal or pattern that appears after the pattern to match |
from | N | string | String literal or pattern that identifies the start of the pattern to match |
to | N | string | String literal or pattern that identifies the end of the pattern to match |
on | N | string | String literal or pattern that identifies the cell characters to replace |
global | N | boolean | If true , all occurrences of matches are replaced. Default is false . |
Identifies the column or columns to which to apply the transform. You can specify one or more columns.
replace col: MyCol on: 'MyString' with: 'myNewString' |
Output: Replaces value MyString
in MyCol
column with myNewString
.
Required? | Data Type |
---|---|
Yes | String (column name) |
For the replace
transform, this value must be a literal value. You can apply values of String or other data types. After replacement, the column data type is re-inferred.
NOTE: Some regular expression capture groups with references (such as |
Required? | Data Type |
---|---|
Yes | Literal of any data type |
replace col:Primary_URL with:'' after:`http({any}|):` |
Output: All content after the protocol identifier (http:
or https:
) is dropped.
Required? | Data Type |
---|---|
No | String (string literal or pattern) |
after
and from
parameters are very similar. from
includes the matching value as part of the replaced string.after
can be used with either to
, on
, or before
. See Pattern Clause Position Matching.replace col: MyCol at: 2,6 with:'MyNewString' |
Output: Replace contents of MyCol
that starts at the second character in the column and extends to the sixth character with the value MyNewString
.
Required? | Data Type |
---|---|
Must use either on or at parameter | Array of two Integers ( X,Y ) |
replace col:credit_card with:'****-***-***-' after:`{start}` before:`({digit}{4}){end}` |
Output:
credit_card
column with asterisks, effectively masking the number.Required? | Data Type |
---|---|
No | String or pattern |
before
and to
parameters are very similar. to
includes the matching value as part of the replaced string.before
can be used with either from
, on
, or after
. See Pattern Clause Position Matching. replace col: MyCol from: '<END>' with: '' |
Output:
<END>
to the end of the string value in MyCol
is removed.Required? | Data Type |
---|---|
No | String or pattern |
after
and from
parameters are very similar. from
includes the matching value as part of the replaced string.from
can be used with either to
or before
. See Pattern Clause Position Matching.
replace col:ssn with:'***-**-' to:`{digit}{3}-{digit}{2}-` |
Output:
ssn
with asterisks to mask the data.Required? | Data Type |
---|---|
No | String or pattern |
before
and to
parameters are very similar. to
includes the matching value as part of the replaced string.to
can be used with either from
or after
. See Pattern Clause Position Matching.replace col: MyCol on: `###ERROR` with:'No error here' |
Tip: You can insert the Unicode equivalent character for this parameter value using a regular expression of the form |
Required? | Data Type |
---|---|
Must use either on or at parameter | String or pattern |
NOTE: If you have specified the pattern to match with the |
Required? | Data Type |
---|---|
No. Default is false . | Boolean |
The replace
transform can take advantage of capture groups defined in the and regular expressions used to search for values within a column. A capture group is a sub-pattern within your pattern that defines a value that you can reference in the replacement.
NOTE: For this transform, capture groups can be specified in the |
In the following example, the on
parameter defines two capture groups, and the with
parameter references them in the replacement. In this example, any entry in the camel_case
column that contains a lower-case letter followed immediately by an upper-case letter is replaced by the same value with a space inserted in the middle. The $1
value references the first capture group in the corresponding :
Capture Group | Description | Replacement Reference |
---|---|---|
({lower}) | A single lower-case letter | $1 |
({upper}) | A single upper-case letter | $2 |