Excerpt |
---|
Matches some or all of a string, based on the user-defined starting and ending index values within the string. |
- Input can be a column reference or a string must be a string literal value.
- Since the
SUBSTRING
function matches based on fixed numeric values, changes to the length or structure of a data field can cause your recipe to fail to properly execute. - The
SUBSTRING
function requires numerical values for the starting and ending values. If you need to match strings using patterns, you should use the extract
transform instead. See Extract Transform.
Column reference example:
D coded-lang-syntax |
---|
RawWrangle | true |
---|
Type | ref |
---|
showNote | true |
---|
WrangleText | derive type:single value: |
---|
| SUBSTRING(MyNamesubstring('Hello, World',0,5) |
|
Output: The first five letters of the MyName
column value is written to the new column.
String literal example:
D code |
---|
derive type:single value:SUBSTRINGsubstring('Hello, World',0,5) |
Output
...
: Returns the string: Hello
.
d-codelang-syntax |
---|
RawWrangle | true |
---|
Type | syntax |
---|
showNote | true |
---|
WrangleText | derive type:single value: |
---|
| SUBSTRING(column_stringsubstring(string_val,start_index,end_index) |
|
substring(string_val,start_index,end_index) |
Argument | Required? | Data Type | Description |
---|
columnstring_stringval | Y | string | Name of the column or string String literal to be applied to the function |
start_index | Y | integer (non-negative) | Index value for the start character from the source column or value |
end_index | Y | integer (non-negative) | Index value for the end character from the source column or value |
...
string_
...
val
String constant to be searched.
- Missing string or column values generate missing string results.
- String constants must be quoted (
'Hello, World'
). - Multiple columns and wildcards are not supported.
...
Data Type | Required? | Example Value |
---|
String | YesMyName | 'This is my string.' |
start_index
Index value of the character in the string to begin the substring match.
...
LastName | ZipCode |
---|
Able | 94101 |
Baker | 23502-1122 |
Charlie | 36845 |
TransformTransformation:
You are interested in the region and area or town identifiers within a zip code region. You can use the following transforms transformations applied to the ZipCode
column to extract this information:
...
trans |
---|
RawWrangle | true |
---|
Type | step |
---|
WrangleText | derive type:single value: |
---|
|
...
substring(ZipCode,1,3) | p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | substring(ZipCode,1,3) |
---|
SearchTerm | New formula |
---|
|
...
trans |
---|
RawWrangle | true |
---|
Type | step |
---|
WrangleText | derive type:single value: |
---|
|
...
substring(ZipCode,3,5) | p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | substring(ZipCode,3,5) |
---|
SearchTerm |
---|
|
...
Since the string can be five or ten characters in length, you need to use the SUBSTRING
function in the second
...
transformation, too. If the data is limited to five-digit zip codes, you could use the RIGHT
function.
Results:
LastName | ZipCode | substring_ZipCode | substring_ZipCode2 |
---|
Able | 94101 | 41 | 01 |
Baker | 23502-1122 | 35 | 02 |
Charlie | 36845 | 68 | 45 |
...