Page tree



Contents:

The cloud-based version of Trifacta Wrangler is now available! Read all about it, and register for your free account.

Contents:


Replaces found string literal or pattern from a source with a string.

Input can be specified as a column reference, a function returning a string, or a string literal, although string literal usage is rare.

  • A column reference can refer to a column of String type.
  • If no match is found, the function returns the source string.
  • If multiple matches are found in a single string, all replacements are made.

Basic Usage

Column reference example:

substitute(myURL,`{ip-address}`,myDomain)

Output: Searches the myURL column values for sub-strings that match valid IP addresses. Where matches are found, they are replaced with the corresponding value in the myDomain column. 

Function reference example:

substitute(upper(companyName),'ACME','New ACME')

Output: Searches the uppercase version of values from the companyName column for the string literal ACME. When found, these matches are replaced by New ACME in the companyName column.

Syntax

substitute(string_source,string_pattern,replacement_string)


ArgumentRequired?Data TypeDescription
string_sourceYstringName of the column, a function returning a string, or string literal to be applied to the function
string_patternYstringString literal or pattern to find
string_replacementYstringString literal to use as replacement
pattern_beforeNstringString literal or pattern to find before finding the string_pattern value.
pattern_afterNstringString literal or pattern to find after finding the string_pattern value.

For more information on syntax standards, see Language Documentation Syntax Notes.

string_source

Name of the item to be searched. Valid values can be:

  • String literals must be quoted ( 'Hello, World' ).
  • Column reference to any type that can be inferred as a string, which encompasses all values
  • Functions that return string values

Multiple values and wildcards are not supported.

Usage Notes:

Required?Data TypeExample Value
YesString literal or column reference (String, Array, or Object)myColumn

string_pattern

String literal or pattern to find. This value can be a string literal, a Trifacta® pattern, or a regular expression.

  • String literals must be quoted ('Hello, World').
  • Multiple values and wildcards are not supported.

Usage Notes:

Required?Data TypeExample Value
YesString literal or pattern'Hello'

string_replacement

Value with which to replacement any matched patterns. Value can be a string, a function returning string values, or a column reference containing strings.

  • String literals must be quoted ('Hello, World').
  • column reference to any type that can be inferred as a string, which encompasses all values. 

Multiple values and wildcards are not supported.

Usage Notes:

 

Required?Data TypeExample Value
YesString literal or column reference (String, Array, or Object)'##REDACTED##'

pattern_before

String literal or pattern to find in a position before the pattern to match. 

Tip: Use this parameter if there are potentially multiple instances of the pattern to match in the source.

Usage Notes:

 

Required?Data TypeExample Value
NoString literal or pattern`{digit}{3}`

pattern_after

String literal or pattern to find in a position after the pattern to match. 

Tip: Use this parameter if there are potentially multiple instances of the pattern to match in the source.

Usage Notes:

Required?Data TypeExample Value
NoString literal or pattern' '

Examples

Example - Partial obfuscation of credit card numbers

Source:

Suppose you have the following transactional data, which contains customer credit card numbers. 

TransactionIdCreditCardNumAmtDollars
T0014111-1111-1111-1111100.29
T0025500-0000-0000-0004510.21
T0033400-0000-0000-009162.13
T0043000-0000-0000-04294.12

For security purposes, you wish to redact the first three sets of digits, so only the last set of digits appears. 

Transformation:

To make the substitution, you must first change the type of the column to be a string:

Transformation Name Change column data type
Parameter: Columns CreditCardNum
Parameter: New type 'String'

You can then use the following transformation to perform the pattern-based replacement of four-digit sets that end in a dash with XXXX:

Transformation Name Edit column with formula
Parameter: Columns CreditCardNum
Parameter: Formula substitute(CreditCardNum, `{digit}+\-`, 'XXXX-')

To indicate that the column no longer contains valid information, you might choose to rename it like in the following:

Transformation Name Rename columns
Parameter: Option Manual rename
Parameter: Column CreditCardNum
Parameter: New column name 'CreditCardNumOBSCURED'

Results:

TransactionIdCreditCardNumOBSCUREDAmtDollars
T001XXXX-XXXX-XXXX-1111100.29
T002XXXX-XXXX-XXXX-0004510.21
T003XXXX-XXXX-XXXX-009162.13
T004XXXX-XXXX-XXXX-04294.12

Your Rating: Results: 1 Star2 Star3 Star4 Star5 Star 12 rates

This page has no comments.