Returns a two-element array of primary and secondary phonetic encodings for an input string, based on the Double Metaphone algorithm.
The Double Metaphone algorithm processes an input string to render a primary and secondary spelling for it. For English language words, the algorithm removes silent letters, normalizes combinations of characters to a single definition, and removes vowels, except from the beginnings of words. In this manner, the algorithm can normalize inconsistencies between spellings for better matching. For more information, see https://en.wikipedia.org/wiki/Metaphone.
Tip: This function is useful for performing fuzzy matching between string values, such as between potential join key values.
Source values can be string literals, column references, or expressions that evaluate to strings.
Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.
Basic Usage
String literal reference example:
doublemetaphone('My String')
Output: See below.
["MSTRNK","MSTRNK"]
Column reference example:
doublemetaphone(string1)
Output: Generates a new double_metaphone
column containing the evaluation of string1
column values through the Double Metaphone algorithm.
Syntax and Arguments
doublemetaphone(string_ref)
Argument | Required? | Data Type | Description |
---|---|---|---|
string_ref | Y | string | Name of column or string literal to apply to the function |
For more information on syntax standards, see Language Documentation Syntax Notes.
string_ref1
String literal, column reference, or expression whose elements you want to filter through the Double Metaphone algorithm.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String literal, column reference, or expression evaluating to a string | myString1 |
Tip: For additional examples, see Common Tasks.
Examples
Example - Phonetic string comparisons
Functions: Source: The following table contains some example strings to be compared. You can use the You can compare Results: The following table contains some example strings to be compared. Item Description
DOUBLEMETAPHONE Function
Returns a two-element array of primary and secondary phonetic encodings for an input string, based on the Double Metaphone algorithm.
DOUBLEMETAPHONEEQUALS Function
Compares two input strings using the Double Metaphone algorithm. An optional threshold parameter can be modified to adjust the tolerance for matching.
string1 string2 notes My String my string comparison is case-insensitive judge juge typo knock nock silent letters white wite missing letters record record two different words in English but match the same pair pear these match but are different words. bookkeeper book keeper spaces cause failures in comparison test1 test123 digits are not compared the end. the end…. punctuation differences do not matter. a elephant an elephant a and an are treated differently.
Transformation:DOUBLEMETAPHONE
function to generate phonetic spellings, as in the following:
Transformation Name
New formula
Parameter: Formula type
Single row formula
Parameter: Formula
DOUBLEMETAPHONE(string1)
Parameter: New column name
'dblmeta_s1'
string1
and string2
using the DOUBLEMETAPHONEEQUALS
function:
Transformation Name
New formula
Parameter: Formula type
Single row formula
Parameter: Formula
DOUBLEMETAPHONEEQUALS(string1, string2, 'normal')
Parameter: New column name
'compare'
string1 dblmeta_s1 string2 compare Notes My String ["MSTRNK","MSTRNK"] my string TRUE comparison is case-insensitive judge ["JJ","AJ"] juge TRUE typo knock ["NK","NK"] nock TRUE silent letters white ["AT","AT"] wite TRUE missing letters record ["RKRT","RKRT"] record TRUE two different words in English but match the same pair ["PR","PR"] pear TRUE these match but are different words. bookkeeper ["PKPR","PKPR"] book keeper FALSE spaces cause failures in comparison test1 ["TST","TST"] test123 TRUE digits are not compared the end. ["0NT","TNT"] the end…. TRUE punctuation differences do not matter. a elephant ["ALFNT","ALFNT"] an elephant FALSE a and an are treated differently.
This page has no comments.