This example illustrates how the following Double Metaphone algorithm functions operate in Trifacta® Wrangler.

  • DOUBLEMETAPHONE - Computes a primary and secondary phonetic encoding for an input string. Encodings are returned as a two-element array. See DOUBLEMETAPHONE Function.
  • DOUBLEMETAPHONEEQUALS - Compares two input strings using the Double Metaphone algorithm. Returns true if they phonetically match. See DOUBLEMETAPHONEEQUALS Function.

Source:

The following table contains some example strings to be compared. 

string1 | string2 | notes
--- | --- | ---
My String | my string | comparison is case-insensitive
judge | juge | typo
knock | nock | silent letters
white | wite | missing letters
record | record | two different English words with the same spelling, so they match
pair | pear | these match but are different words.
bookkeeper | book keeper | spaces cause failures in comparison
test1 | test123 | digits are not compared
the end. | the end…. | punctuation differences do not matter.
a elephant | an elephant | a and an are treated differently.


Transformation:

You can use the DOUBLEMETAPHONE function to generate phonetic spellings, as in the following:

Transformation Name | New formula
Parameter: Formula type | Single row formula
Parameter: Formula | DOUBLEMETAPHONE(string1)
Parameter: New column name | 'dblmeta_s1'

You can compare string1 and string2 using the DOUBLEMETAPHONEEQUALS function:

Transformation Name | New formula
Parameter: Formula type | Single row formula
Parameter: Formula | DOUBLEMETAPHONEEQUALS(string1, string2, 'normal')
Parameter: New column name | 'compare'
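
The third argument ('normal') is not explained in this example. As a rough sketch only, the comparison can be modeled in Python under the assumption that this argument is a match threshold controlling which of the two encodings must agree. The threshold semantics below, the function name encodings_match, and the encodings shown for the second string are all assumptions made for illustration; they are not taken from this page.

```python
from typing import Tuple

# (primary, secondary) encodings, in the order DOUBLEMETAPHONE returns them.
Encoding = Tuple[str, str]


def encodings_match(a: Encoding, b: Encoding, threshold: str = "normal") -> bool:
    """Hypothetical model of a threshold-based Double Metaphone comparison.

    Assumed semantics (not confirmed by this page):
      strong - the two primary encodings must be equal
      normal - the primary encoding of one string must equal
               either encoding of the other string
      weak   - any encoding of one string must equal any encoding of the other
    """
    if threshold == "strong":
        return a[0] == b[0]
    if threshold == "normal":
        return a[0] in b or b[0] in a
    if threshold == "weak":
        return bool(set(a) & set(b))
    raise ValueError(f"unknown threshold: {threshold!r}")


# First encoding is from the results table ("the end.");
# the second is a hypothetical encoding for a comparison string.
first = ("0NT", "TNT")
second = ("TNT", "TNT")

for level in ("strong", "normal", "weak"):
    print(level, encodings_match(first, second, level))
```

Under these assumed semantics, the pair above fails the strong comparison (the primary encodings differ) but passes the normal and weak ones, because the second string's primary encoding equals the first string's secondary encoding.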

Results:

The following table shows the example strings with the generated encodings (dblmeta_s1) and the comparison results (compare).

string1 | dblmeta_s1 | string2 | compare | Notes
--- | --- | --- | --- | ---
My String | ["MSTRNK","MSTRNK"] | my string | TRUE | comparison is case-insensitive
judge | ["JJ","AJ"] | juge | TRUE | typo
knock | ["NK","NK"] | nock | TRUE | silent letters
white | ["AT","AT"] | wite | TRUE | missing letters
record | ["RKRT","RKRT"] | record | TRUE | two different English words with the same spelling, so they match
pair | ["PR","PR"] | pear | TRUE | these match but are different words.
bookkeeper | ["PKPR","PKPR"] | book keeper | FALSE | spaces cause failures in comparison
test1 | ["TST","TST"] | test123 | TRUE | digits are not compared
the end. | ["0NT","TNT"] | the end…. | TRUE | punctuation differences do not matter.
a elephant | ["ALFNT","ALFNT"] | an elephant | FALSE | a and an are treated differently.
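
If the dblmeta_s1 column is exported for downstream processing, each cell can be split into its primary and secondary encodings. The following Python sketch assumes the cell is available as JSON-style text exactly as displayed above; the helper name split_encodings is hypothetical, and the sample values are copied from the results table.

```python
import json


def split_encodings(cell):
    """Parse a dblmeta_s1 cell such as '["JJ","AJ"]' into (primary, secondary)."""
    primary, secondary = json.loads(cell)
    return primary, secondary


# Values copied from the dblmeta_s1 column of the results table.
cells = {
    "judge": '["JJ","AJ"]',
    "knock": '["NK","NK"]',
    "the end.": '["0NT","TNT"]',
}

# Collect the strings whose primary and secondary encodings differ.
differing = []
for text, cell in cells.items():
    primary, secondary = split_encodings(cell)
    if primary != secondary:
        differing.append(text)

print(differing)  # ['judge', 'the end.']
```

Most rows in the table have identical primary and secondary encodings; rows such as judge and the end. are the ones where the choice of encoding could affect a comparison.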
