Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space DEV and version r097

D toc

D s transforms

Excerpt

Replaces values within the specified column or columns based on the string literal, pattern, or location within the cell value, as specified in the transform. 

The replace transform is used primarily to match on patterns within a string. For entire cell replacement across all rows of the column, use the set transform. See Set Transform.

D s
snippetBasic

on parameter example: 

Specifies the string literal or pattern to match.

D code
replace col: text on: 'honda' with:'toyota' global: true

Output: Replaces all instances in the text column of honda with toyota. If honda appears twice a cell, both instances are replaced with toyota.

at parameter example: 

Specifies the beginning character and ending character as index values for the match.

D code

replace col: text at: 2,6 with:'replacement text'

Output: For all values in the text column, replace the substring between character 2 and 6 in the column with the value replacement text. If the length of the original cell value is less than 6, the replacement value is inserted.

D s
snippetParameters

D code

replace col:column_ref with:'literal_replacement' [at:(start_index,end_index)] [on:string_literal_pattern] [global:true|false]

TokenRequired?Data TypeDescription
replaceYtransformName of the transform
colYstringName of column where to make replacements
withYsee belowLiteral value with which to replace matched values
afterNstringString literal or pattern that precedes the pattern to match
atNArrayTwo-integer array identifying the character indexes of start and end characters to match
beforeNstringString literal or pattern that appears after the pattern to match
fromNstringString literal or pattern that identifies the start of the pattern to match
toNstringString literal or pattern that identifies the end of the pattern to match
onNstringString literal or pattern that identifies the cell characters to replace
globalNbooleanIf true, all occurrences of matches are replaced. Default is false.

D s lang notes

col

Identifies the column or columns to which to apply the transform. You can specify one or more columns. 

Include Page
col_multi Parameter
col_multi Parameter

D code

replace col: MyCol on: 'MyString' with: 'myNewString'

Output: Replaces value MyString in  MyCol column with myNewString.

D s
snippetusage

Required?Data Type
YesString (column name)

with

Include Page
with Parameter
with Parameter

For the replace transform, this value must be a literal value. You can apply values of String or other data types. After replacement, the column data type is re-inferred.

Info

NOTE: Some regular expression capture groups with references (such as $2) are supported across all running environments. See Capture Group References.

D s
snippetusage

Required?Data Type
YesLiteral of any data type

after

D code

replace col:Primary_URL with:'' after:`http({any}|):`

Output: All content after the protocol identifier (http: or https:) is dropped.

Include Page
after Parameter
after Parameter

D s
snippetusage

Required?Data Type
NoString (string literal or pattern)
  • The after and from parameters are very similar. from includes the matching value as part of the replaced string.
  • after can be used with either toon, or beforeSee  Pattern Clause Position Matching.

at

D code

replace col: MyCol at: 2,6 with:'MyNewString'

Output: Replace contents of MyCol that starts at the second character in the column and extends to the sixth character with the value MyNewString.

Include Page
at Parameter
at Parameter

D s
snippetusage

Required?Data Type
Must use either on or at parameterArray of two Integers ( X,Y )

before

Include Page
before Parameter
before Parameter

D code

replace col:credit_card with:'****-***-***-' after:`{start}` before:`({digit}{4}){end}`

Output:

  • Replaces first three groups of digits in the credit_card column with asterisks, effectively masking the number.

D s
snippetusage

Required?Data Type
NoString or pattern
  • The before and to parameters are very similar.  to includes the matching value as part of the replaced string.
  • before can be used with either fromon, or after. See  Pattern Clause Position Matching

from

Include Page
from Parameter
from Parameter

D code

replace col: MyCol from: '<END>' with: ''

Output: 

  • All content from the string <END> to the end of the string value in MyCol is removed.

D s
snippetusage

Required?Data Type
NoString or pattern
  • The after and from parameters are very similar.  from includes the matching value as part of the replaced string.
  • from can be used with either to or before. See  Pattern Clause Position Matching.

to

Include Page
to Parameter
to Parameter

D code

replace col:ssn with:'***-**-' to:`{digit}{3}-{digit}{2}-`

Output:

  • Replace first two number groups in the column ssn with asterisks to mask the data.

D s
snippetusage

Required?Data Type
NoString or pattern
  • The before and to parameters are very similar.  to includes the matching value as part of the replaced string.
  • to can be used with either from or after. See Pattern Clause Position Matching.

on

D code

replace col: MyCol on: `###ERROR` with:'No error here'

Include Page
on Parameter
on Parameter

Tip

Tip: You can insert the Unicode equivalent character for this parameter value using a regular expression of the form /\uHHHH/. For example, /\u0013/ represents Unicode character 0013 (carriage return). For more information, see Supported Special Regular Expression Characters.

D s
snippetusage

Required?Data Type
Must use either on or at parameterString or pattern

global

Include Page
global Parameter
global Parameter

Info

NOTE: If you have specified the pattern to match with the at parameter, then the number of possible replacement instances is only 1, so the global parameter is not used.

D s
snippetusage

Required?Data Type
No. Default is false.Boolean


D s
snippetExamples

Example - Clean up marketing contact data with replace, set, and extract

Include Page
EXAMPLE - Replacement Transforms
EXAMPLE - Replacement Transforms

Example - Using capture group references for replacements

The replace transform can take advantage of capture groups defined in the 

D s lang
itempatterns
 and regular expressions used to search for values within a column. A capture group is a sub-pattern within your pattern that defines a value that you can reference in the replacement.

Info

NOTE: For this transform, capture groups can be specified in the on parameter only.

In the following example, the on parameter defines two capture groups, and the with parameter references them in the replacement. In this example, any entry in the camel_case column that contains a lower-case letter followed immediately by an upper-case letter is replaced by the same value with a space inserted in the middle. The $1 value references the first capture group in the corresponding 

D s lang
itempattern
:

D trans
RawWrangletrue
p03Value'$1 $2'
Typestep
WrangleTextreplace col:camel_case with:'$1 $2' on:`({lower})({upper})` global:true
p01NameColumn
p01Valuecamel_case
p02NameFind
p02Value`({lower})({upper})`
p03NameReplace with
p04Valuetrue
p04NameMatch all occurrences
SearchTermReplace text or pattern

Capture GroupDescriptionReplacement Reference
({lower})A single lower-case letter$1
({upper})A single upper-case letter$2


D s also
labelwrangle_transform_replace