In most join operations, the values in primary keys across two tables must match exactly for the related columns to be included in the join. In a range join, you can change the comparative operator for the keys from Equal to one that specifies a range of matching values.
Comparative operators:
- Not equal to
- Greater than
- Greater than or equal to
- Less than
- Less than or equal to
Range joins allow you to include many more matching values and therefore rows in the join. Depending on the matches and the included columns, your resulting dataset can become very large. You should use this feature with some caution.
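To illustrate why a range join can return many more rows than an equality join, here is a minimal sketch in plain Python, outside the product. The `range_join` helper, the `CONDITIONS` mapping, and the sample `orders` and `tiers` data are all hypothetical and are not part of the application; they only mimic the behavior described above.

```python
import operator

# Hypothetical mapping of the condition names to Python comparison operators.
CONDITIONS = {
    "Equal to": operator.eq,
    "Not equal to": operator.ne,
    "Greater than": operator.gt,
    "Greater than or equal to": operator.ge,
    "Less than": operator.lt,
    "Less than or equal to": operator.le,
}

def range_join(left, right, left_key, right_key, condition):
    """Inner join: pair each left row with every right row that satisfies the condition."""
    compare = CONDITIONS[condition]
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if compare(left_row[left_key], right_row[right_key])
    ]

# Illustrative sample data (assumed, not from the product).
orders = [
    {"order_id": 1, "qty": 5},
    {"order_id": 2, "qty": 20},
    {"order_id": 3, "qty": 50},
]
tiers = [
    {"tier": "bronze", "min_qty": 5},
    {"tier": "silver", "min_qty": 20},
    {"tier": "gold", "min_qty": 50},
]

equal_rows = range_join(orders, tiers, "qty", "min_qty", "Equal to")
range_rows = range_join(orders, tiers, "qty", "min_qty", "Greater than or equal to")

# The range condition matches each order against every tier it qualifies for.
print(len(equal_rows), len(range_rows))  # 3 6
```

With only three rows on each side, the range condition already doubles the output. On large datasets the growth can be much steeper, which is the reason for the caution above.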
- Range joins apply only to keys whose data types can be compared.
For example, for joins involving keys of Binary data type, you can use only Equal to or Not equal to joins.
Tip: Range joins cannot be applied to Datetime data type values directly. However, if you convert the values to numeric Unix time values, you should be able to specify a range join. For more information, see UNIXTIME Function.
- Any range comparison that includes one or more string columns as keys uses string (character-by-character) comparison for greater than and less than, not numerical comparison.
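The two data type notes above can be sketched in plain Python, outside the product. The `to_unix_seconds` helper is hypothetical and only approximates the idea of converting timestamps to Unix time; it is not the product's UNIXTIME function, and it assumes the input strings are ISO 8601 timestamps in UTC.

```python
from datetime import datetime, timezone

# String keys compare character by character, so "9" sorts after "10",
# while the same values compared as numbers give the opposite result.
print("9" > "10")  # True
print(9 > 10)      # False

def to_unix_seconds(text):
    """Parse an ISO 8601 timestamp (assumed to be UTC) into integer Unix seconds."""
    parsed = datetime.fromisoformat(text).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

# Once converted to numbers, timestamps can be compared with range operators.
start = to_unix_seconds("2024-01-01T00:00:00")
event = to_unix_seconds("2024-01-01T00:00:30")
print(event - start)   # 30
print(event >= start)  # True
```

If keys that look numeric are stored as strings, converting them to a numeric type before the join avoids the first surprise.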
Enable:
This feature may need to be enabled by a workspace administrator. See Workspace Dataprep Project Settings Page.
After range joins have been enabled, you can specify them as part of performing any join operation.
- In the Search panel, enter join datasets in the search box.
- Select the dataset with which to join the current one. Then, click Accept.
- In the Join window, select the join type.
- In the Join Keys area, click the Pencil icon.
- Specify the fields in the current dataset and the joined-in dataset.
From the Condition drop-down, select the range operator to use:
Figure: Select range operator
Specify other properties for the matching keys.
Click Save and Continue.
Specify other elements of the join. When finished, click Add to Recipe.