- Review your record counts. Before you specify the join, you should review your record counts and the uniqueness of your keys, which should provide an idea of the number of records you may see in the output. Note that the number of output records depends on the type of join and the matches between join keys.
- Review your join key values. If there are variations in the values in your join keys, you may end up with duplicate records in your joined dataset. Look for mismatched or missing values in your join keys, and correct them if possible.
- Review the granularity of your data. If you bring together data at a lower fidelity than the source, you can end up with record matches that are not actually matching data. For example, if your timestamps are down-sampled from milliseconds to seconds as part of the join, you may have "matching" timestamps in seconds that were not matches at the millisecond level in the source data.
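The checks above can be sketched outside the application before you build the join. The following is a minimal illustration in plain Python, not the product's own logic. The sample records, the `cust` key name, and the helper functions are all hypothetical. It counts key occurrences, estimates the row count of an inner join, and shows how down-sampling timestamps from milliseconds to seconds can create a match that does not exist in the source data.

```python
from collections import Counter

def key_counts(records, key):
    """Count how many records carry each join key value."""
    return Counter(r[key] for r in records)

def expected_inner_join_rows(left, right, key):
    """Estimate inner-join output: for each shared key, left count * right count."""
    left_counts = key_counts(left, key)
    right_counts = key_counts(right, key)
    return sum(n * right_counts[k] for k, n in left_counts.items() if k in right_counts)

# Hypothetical sample data: "B" is duplicated on the right, "C" has no match.
orders = [{"cust": "A"}, {"cust": "A"}, {"cust": "B"}, {"cust": "C"}]
customers = [{"cust": "A"}, {"cust": "B"}, {"cust": "B"}]

# Keys that are not unique in the right-hand source.
duplicate_keys = [k for k, n in key_counts(customers, "cust").items() if n > 1]
rows = expected_inner_join_rows(orders, customers, "cust")

# Granularity check: two timestamps in milliseconds within the same second.
ts_left_ms = 1700000000123
ts_right_ms = 1700000000987
match_at_ms = ts_left_ms == ts_right_ms                  # compared at source fidelity
match_at_s = ts_left_ms // 1000 == ts_right_ms // 1000   # compared after down-sampling

print(duplicate_keys, rows, match_at_ms, match_at_s)
```

In this sample, the duplicated key on the right side multiplies the matching rows, and the two timestamps match only after they are truncated to seconds. Other join types add unmatched rows from one or both sides, so the estimate applies to inner joins only.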
Step 1 - Select Dataset
In the Search panel, enter
The following options are applied to the join key columns in both sources when attempting to find matches. After the join is executed, no data in either column is changed based on these selections.
|Use metaphone fuzzy match||Use a fuzzy matching algorithm for key value matching. Fuzzy matching uses the doublemetaphone algorithm for matching strings (keys). Both primary encodings of each key value must match. See DOUBLEMETAPHONEEQUALS Function.|
|Ignore case||Ignore case differences between the join key values for matching purposes.|
|Ignore special characters||Ignore all characters that are not alphanumeric, accented Latin characters, or whitespace, prior to testing for a match.|
|Ignore whitespace||Ignore all whitespace characters, including spaces, tabs, carriage returns, and newlines.|
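To make the effect of the case, special character, and whitespace options concrete, here is a rough Python sketch of key normalization applied only for comparison. It is an approximation under stated assumptions, not the product's implementation: the `normalize_key` function and its parameters are hypothetical, and `str.isalnum()` keeps any Unicode letter or digit, which is broader than "alphanumeric and accented Latin characters". The fuzzy match option is not covered.

```python
def normalize_key(value, ignore_case=False, ignore_special=False, ignore_whitespace=False):
    """Return a comparison copy of a join key; the input value is not modified."""
    key = value
    if ignore_case:
        key = key.lower()
    if ignore_special:
        # Keep letters, digits, and whitespace; drop everything else.
        # Assumption: isalnum() approximates the documented character set.
        key = "".join(ch for ch in key if ch.isalnum() or ch.isspace())
    if ignore_whitespace:
        # Drop spaces, tabs, carriage returns, and newlines.
        key = "".join(ch for ch in key if not ch.isspace())
    return key

left_value = "O'Brien "
right_value = "o brien\t"

# Without any options, the raw values do not match.
raw_match = left_value == right_value

# With all three options, both values normalize to the same comparison key.
options = dict(ignore_case=True, ignore_special=True, ignore_whitespace=True)
normalized_match = normalize_key(left_value, **options) == normalize_key(right_value, **options)

print(raw_match, normalized_match)
```

Because the normalized strings are used only to test for a match, the original values in both columns stay as they were, which mirrors the behavior described above.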
To add the specified join to your recipe, click Add to Recipe.