On January 27, 2021, Google is changing the required permissions for attaching IAM roles to service accounts. If you are using IAM roles for your Google service accounts, please see Changes to User Management.
Contents:
Input can be specified as a column reference, a function returning a string, or a string literal, although string literal usage is rare.
- A column reference can refer to a column of String type.
- If no match is found, the function returns the source string.
- If multiple matches are found in a single string, all replacements are made.
Wrangle vs. SQL: This function is part of Wrangle , a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.
Basic Usage
Column reference example:
substitute(myURL,`{ip-address}`,myDomain)
Output: Searches the myURL
column values for sub-strings that match valid IP addresses. Where matches are found, they are replaced with the corresponding value in the myDomain
column.
Function reference example:
substitute(upper(companyName),'ACME','New ACME')
Output: Searches the uppercase version of values from the companyName
column for the string literal ACME
. When found, these matches are replaced by New ACME
in the companyName
column.
Syntax and Arguments
substitute(string_source,string_pattern,string_replacement[,ignore_case, pattern_before, pattern_after])
Argument | Required? | Data Type | Description |
---|---|---|---|
string_source | Y | string | Name of the column, a function returning a string, or string literal to be applied to the function |
string_pattern | Y | string | String literal or pattern or a column or a function returning strings to find |
string_replacement | Y | string | String literal, column or function returning a string to use as replacement |
ignore_case | N | string | When true , matching is case-insensitive. Default is false . |
pattern_before | N | string | String literal or pattern to find before finding the string_pattern value. |
pattern_after | N | string | String literal or pattern to find after finding the string_pattern value. |
For more information on syntax standards, see Language Documentation Syntax Notes.
string_source
Name of the item to be searched. Valid values can be:
- String literals must be quoted (
'Hello, World'
). - Column reference to any type that can be inferred as a string, which encompasses all values
- Functions that return string values
Multiple values and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String literal or column reference (String, Array, or Object) | myColumn |
string_pattern
String to find. This value can be a string literal, a Pattern , a regular expression, a column, or a function returning a String value.
- String literals must be quoted (
'Hello, World'
). - Multiple values and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String value or pattern or column reference (String) | 'Hello' |
string_replacement
Value with which to replacement any matched patterns. Value can be a string, a function returning string values, or a column reference containing strings.
- String literals must be quoted (
'Hello, World'
). - Column reference to any type that can be inferred as a string, which encompasses all values.
- Multiple values and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String literal, column reference (String, Array, or Object), or function returning String value | '##REDACTED##' |
ignore_case
When true
, matches are case-insensitive. Default is false
.
NOTE: This argument is not required. By default, matches are case-sensitive.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
No | String value | 'false'
|
pattern_before
String literal or pattern to find in a position before the pattern to match.
NOTE: This argument is not permitted when string_ pattern
or string_replacement
is of column data type.
Tip: Use this argument if there are potentially multiple instances of the pattern to match in the source.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
No | String literal or pattern | `{digit}{3}` |
pattern_after
String literal or pattern to find in a position after the pattern to match.
NOTE: This argument is not permitted when string_ pattern
or string_replacement
is of column data type.
Tip: Use this argument if there are potentially multiple instances of the pattern to match in the source.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
No | String literal or pattern | ' '
|
Tip: For additional examples, see Common Tasks.Examples
Example - Partial obfuscation of credit card numbers
Source:
Suppose you have the following transactional data, which contains customer credit card numbers.
TransactionId | CreditCardNum | AmtDollars |
---|---|---|
T001 | 4111-1111-1111-1111 | 100.29 |
T002 | 5500-0000-0000-0004 | 510.21 |
T003 | 3400-0000-0000-009 | 162.13 |
T004 | 3000-0000-0000-04 | 294.12 |
For security purposes, you wish to redact the first three sets of digits, so only the last set of digits appears.
Transformation:
To make the substitution, you must first change the type of the column to be a string:
Transformation Name | Change column data type |
---|---|
Parameter: Columns | CreditCardNum |
Parameter: New type | 'String' |
You can then use the following transformation to perform the pattern-based replacement of four-digit sets that end in a dash with XXXX
:
Transformation Name | Edit column with formula |
---|---|
Parameter: Columns | CreditCardNum |
Parameter: Formula | substitute(CreditCardNum, `{digit}+\-`, 'XXXX-') |
To indicate that the column no longer contains valid information, you might choose to rename it like in the following:
Transformation Name | Rename columns |
---|---|
Parameter: Option | Manual rename |
Parameter: Column | CreditCardNum |
Parameter: New column name | 'CreditCardNumOBSCURED' |
Results:
TransactionId | CreditCardNumOBSCURED | AmtDollars |
---|---|---|
T001 | XXXX-XXXX-XXXX-1111 | 100.29 |
T002 | XXXX-XXXX-XXXX-0004 | 510.21 |
T003 | XXXX-XXXX-XXXX-009 | 162.13 |
T004 | XXXX-XXXX-XXXX-04 | 294.12 |
This page has no comments.