Join Types
The following types of joins are supported. For example, the following tables contains information about employees and departments.
Employee table:
Name | DepartmentID | Role |
---|---|---|
Dave Smith | 001 | Product Marketing Manager |
Julie Jones | 002 | Software Engineer |
Scott Tanner | 001 | Director of Demand Gen |
Ted Connors | 002 | Software Engineer |
Margaret Lane | 001 | VP of Marketing |
Mary Martin | 004 | Receptionist |
Department table:
Name | DepartmentID |
---|---|
Marketing | 001 |
Engineering | 002 |
Accounting | 003 |
In the above example, DepartmentID
is the key to use in both tables for any joins.
Inner Join
An inner join requires that key values exist in both tables for the records to appear in the results table. Records appear in the merge only if there are matches in both tables for the key values.
If you want to include rows containing non-matching values, you must use some form of an outer join. See below.
For the preceding example tables, an inner join on the DepartmentID
table produces the following result table:
Employee.Name | Employee.DepartmentID | Employee.Role | Department.Name | Department.DepartmentID |
---|---|---|---|---|
Dave Smith | 001 | Product Marketing Manager | Marketing | 001 |
Julie Jones | 002 | Software Engineer | Engineering | 002 |
Scott Tanner | 001 | Director of Demand Gen | Marketing | 001 |
Ted Connors | 002 | Software Engineer | Engineering | 002 |
Margaret Lane | 001 | VP of Marketing | Marketing | 001 |
Notes:
All fields are included in the merged result set. Fields from the first dataset are listed first.
The row for Mary Martin is excluded, since there is no reference in the Department table for her department identifier. The row for Accounting is excluded, since there is no reference in the Employee table for the department identifier.
To include these rows, you either need to augment the data or perform a form of an outer join.
A null value in one table does not match a null value in another table. So, rows with null values in a join key are never included in an inner join. These values should be fixed.
Dica
An inner join can be used to eliminate rows with null values in their key fields.
Left Join
A left join (or left outer join) does not require that there be matching records for each value in the key value of the source (left) table. Each row in the left table appears in the results, regardless of whether there are matches in the right table.
For the preceding example tables, a left join on the DepartmentID
table produces the following result table:
Employee.Name | Employee.DepartmentID | Employee.Role | Department.Name | Department.DepartmentID |
---|---|---|---|---|
Dave Smith | 001 | Product Marketing Manager | Marketing | 001 |
Julie Jones | 002 | Software Engineer | Engineering | 002 |
Scott Tanner | 001 | Director of Demand Gen | Marketing | 001 |
Ted Connors | 002 | Software Engineer | Engineering | 002 |
Margaret Lane | 001 | VP of Marketing | Marketing | 001 |
Mary Martin | 004 | Receptionist | NULL | NULL |
Notes:
In this left join, the Mary Martin row has been added to the result, since her record in the Employee table does contain an entry for the
DepartmentID
. However, since there are no corresponding values in the Department table, the corresponding fields in the result table areNULL
values.
Right Join
A right join (or right outer join) is the reverse of a left join. A right join does not require that there be matching records for each value in the key value of the secondary (right) table. Each row in the right table appears in the results, regardless of whether there are matches in the left table.
For the preceding example tables, a right join on the DepartmentID
table produces the following result table:
Employee.Name | Employee.DepartmentID | Employee.Role | Department.Name | Department.DepartmentID |
---|---|---|---|---|
Dave Smith | 001 | Product Marketing Manager | Marketing | 001 |
Julie Jones | 002 | Software Engineer | Engineering | 002 |
Scott Tanner | 001 | Director of Demand Gen | Marketing | 001 |
Ted Connors | 002 | Software Engineer | Engineering | 002 |
Margaret Lane | 001 | VP of Marketing | Marketing | 001 |
NULL | NULL | NULL | Accounting | 003 |
Notes:
In this right join, the Accounting entry is added. However, since there is no entry in the Employee table for the
DepartmentID
value, those fields areNULL
values in the result set.
Full Outer Join
A full outer join combines the effects of a left join and a right join. If there is a match between the key values, a row is written in the result.
If there is no match for a key value that appears in either table, a single record is written to the result, with
NULL
values inserted for the fields from the other table.
Employee.Name | Employee.DepartmentID | Employee.Role | Department.Name | Department.DepartmentID |
---|---|---|---|---|
Dave Smith | 001 | Product Marketing Manager | Marketing | 001 |
Julie Jones | 002 | Software Engineer | Engineering | 002 |
Scott Tanner | 001 | Director of Demand Gen | Marketing | 001 |
Ted Connors | 002 | Software Engineer | Engineering | 002 |
Margaret Lane | 001 | VP of Marketing | Marketing | 001 |
Mary Martin | 004 | Receptionist | NULL | NULL |
NULL | NULL | NULL | Accounting | 003 |
Notes:
Any duplicated rows between joining from left-to-right and from right-to-left are removed from the results.
Cross Join
A cross join combines each row of the first data set with each row of the second dataset, where every combination is represented in the output. As a result, the number of total rows in the join are:
Rows(DatasetA) * Rows(DatasetB)
Nota
Depending on the size of your datasets, a cross join can greatly expand the size of the output, which may increase costs in some environments.
Self Join
A self join is a join operation between a dataset and a copy of itself. For example, you can use a self-join to invert the structure of hierarchical data, such as brand-product or manager-employee.
Designer Cloud supports joins between a recipe and any upstream recipe or dataset.
You cannot join a recipe to itself.
You can join it to its source imported dataset.When a self-join is performed with a recipe connected to its source dataset, only one line connects the imported dataset with the recipe in Flow View. This is as designed.
You can join a recipe to any recipe upstream of it. Examples:
You can create an empty recipe after the recipe from which you wish to self-join. In this new empty recipe, you add the join step back to the original recipe.
You can insert an empty recipe between an imported dataset and the recipe where the self-join is performed. When you perform the self join in the first recipe, you join to the empty recipe you just created in between.
In both examples, you can see multiple lines in Flow View to indicate the self-join.
Joins Together
The following diagram summarizes the relationships between the types of supported joins. In each venn diagram, the area of intersection is the set of records that contain shared key values.