Removes all characters from a string that are not letters, numbers, accented Latin characters, or whitespace.
NOTE: Non-Latin letters are also removed.
Tip: This function also removes common punctuation, such as the following:
. , ! & ?
To preserve these characters, you might replace them with an alphanumeric text string. For example, the question mark might be replaced by:
zzQUESTIONMARKzz
After the function has been applied, you can replace these strings with the original values.
Basic Usage
Column reference example:
derive type:single value:REMOVESYMBOLS(MyName)
Output: The value of the MyName
column value with all non-alphanumeric characters removed.
String literal example:
derive type:single value:REMOVESYMBOLS('Héllõ, Wõrlds!?!?')
Output: The string Héllõ Wõrlds
is written to the new column.
Wildcard example:
set col:* value:REMOVESYMBOLS($col)
Output: Strips all non-alphanumeric or space characters from all columns in the dataset.
Syntax and Arguments
derive type:single value:REMOVESYMBOLS(column_string)
Argument | Required? | Data Type | Description |
---|---|---|---|
column_string | Y | string | Name of the column or string literal to be applied to the function |
For more information on syntax standards, see Language Documentation Syntax Notes.
column_string
Name of the column or string constant to be trimmed of symbols.
- Missing string or column values generate missing string results.
- String constants must be quoted (
'Hello, World'
). - Multiple columns and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String literal or column reference | myColumn |
Tip: For additional examples, see Common Tasks.
Examples
Example - String cleanup functions together
TRIM
- remove leading and trailing whitespace. See TRIM Function.REMOVEWHITESPACE
- remove leading and trailing whitespace and all whitespace in between. See REMOVEWHITESPACE Function.REMOVESYMBOLS
- remove all characters that are not alpha-numeric or whitespace. See REMOVESYMBOLS Function.
Source:
In the following (space)
and (tab)
indicate space keys and tabs, respectively. Carriage return and newline characters are also supported by whitespace functions.
Strings | source |
---|---|
String01 | this source(space)(space) |
String02 | (tab)(tab)this source |
String03 | (tab)(tab)this source(space)(space) |
String04 | this source's? |
String05 | Why, you @#$%^&*()! |
String06 | this söurce |
String07 | (space)this söurce |
String08 | à mañana |
Transform:
The following transforms generate new columns using each of the string cleanup functions:
derive type:single value: TRIM(source) as: 'trim_source'
derive type:single value: REMOVEWHITESPACE(source) as: 'removewhitespace_source'
derive type:single value: REMOVESYMBOLS(source) as: 'removesymbols_source'
Results:
Strings | source | removesymbols_source | removewhitespace_source | trim_source |
---|---|---|---|---|
String01 | this source(space)(space) | this source(space)(space) | thissource | this source |
String02 | (tab)(tab)this source | (tab)(tab)this source | thissource | this source |
String03 | (tab)(tab)this source(space)(space) | (tab)(tab)this source(space)(space) | thissource | this source |
String04 | this source's? | this sources | thissource's? | this source's? |
String05 | "Why, you @#$%^&*()!" | Why you | Why,you@#$%^&*()! | Why, you @#$%^&*()! |
String06 | this söurce | this söurce | thissöurce | this söurce |
String07 | (space)this söurce | (space)this söurce | thissöurce | this söurce |
String08 | à mañana | à mañana | àmañana | à mañana |
This page has no comments.