
The following types of joins are supported. The examples below use the following two tables, which contain information about employees and departments.

Employee table:

| Name | DepartmentID | Role |
| --- | --- | --- |
| Dave Smith | 001 | Product Marketing Manager |
| Julie Jones | 002 | Software Engineer |
| Scott Tanner | 001 | Director of Demand Gen |
| Ted Connors | 002 | Software Engineer |
| Margaret Lane | 001 | VP of Marketing |
| Mary Martin | 004 | Receptionist |

Department table:

| Name | DepartmentID |
| --- | --- |
| Marketing | 001 |
| Engineering | 002 |
| Accounting | 003 |

In the above example, DepartmentID is the key to use in both tables for any joins.

Inner Join

An inner join requires that key values exist in both tables for the records to appear in the results table. Records appear in the merge only if there are matches in both tables for the key values.

  • If you want to include rows containing non-matching values, you must use some form of an outer join. See below.

For the preceding example tables, an inner join on the DepartmentID key produces the following result table:

| Employee.Name | Employee.DepartmentID | Employee.Role | Department.Name | Department.DepartmentID |
| --- | --- | --- | --- | --- |
| Dave Smith | 001 | Product Marketing Manager | Marketing | 001 |
| Julie Jones | 002 | Software Engineer | Engineering | 002 |
| Scott Tanner | 001 | Director of Demand Gen | Marketing | 001 |
| Ted Connors | 002 | Software Engineer | Engineering | 002 |
| Margaret Lane | 001 | VP of Marketing | Marketing | 001 |

Notes:

  • All fields are included in the merged result set. Fields from the first dataset are listed first.
  • The row for Mary Martin is excluded, since there is no reference in the Department table for her department identifier. The row for Accounting is excluded, since there is no reference in the Employee table for the department identifier.
    • To include these rows, you either need to augment the data or perform a form of an outer join.
  • A null value in one table does not match a null value in another table. So, rows with null values in a join key are never included in an inner join. These values should be fixed.

    Tip: An inner join can be used to eliminate rows with null values in their key fields.
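The inner join behavior described above can be sketched in plain Python. This is an illustration only, not how the product implements joins: the `inner_join` function and the tuple layout are assumptions made for this example, and a null key is represented as `None`.

```python
# Example data copied from the Employee and Department tables above.
employees = [
    ("Dave Smith", "001", "Product Marketing Manager"),
    ("Julie Jones", "002", "Software Engineer"),
    ("Scott Tanner", "001", "Director of Demand Gen"),
    ("Ted Connors", "002", "Software Engineer"),
    ("Margaret Lane", "001", "VP of Marketing"),
    ("Mary Martin", "004", "Receptionist"),
]
departments = [
    ("Marketing", "001"),
    ("Engineering", "002"),
    ("Accounting", "003"),
]


def inner_join(left, right):
    """Hypothetical helper: keep only rows whose key exists in both tables."""
    result = []
    for name, dept_id, role in left:
        for dept_name, right_id in right:
            # A null (None) key never matches anything, including another null.
            if dept_id is not None and dept_id == right_id:
                result.append((name, dept_id, role, dept_name, right_id))
    return result


rows = inner_join(employees, departments)
for row in rows:
    print(row)
```

Mary Martin (key 004) and Accounting (key 003) have no partner in the other table, so neither appears in `rows`.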

Left Join

A left join (or left outer join) does not require each key value in the source (left) table to have a matching record in the right table. Each row in the left table appears in the results, regardless of whether there are matches in the right table.

For the preceding example tables, a left join on the DepartmentID key produces the following result table:

| Employee.Name | Employee.DepartmentID | Employee.Role | Department.Name | Department.DepartmentID |
| --- | --- | --- | --- | --- |
| Dave Smith | 001 | Product Marketing Manager | Marketing | 001 |
| Julie Jones | 002 | Software Engineer | Engineering | 002 |
| Scott Tanner | 001 | Director of Demand Gen | Marketing | 001 |
| Ted Connors | 002 | Software Engineer | Engineering | 002 |
| Margaret Lane | 001 | VP of Marketing | Marketing | 001 |
| Mary Martin | 004 | Receptionist | NULL | NULL |

Notes:

  • In this left join, the Mary Martin row has been added to the result, since her record in the Employee table does contain an entry for the DepartmentID. However, since there are no corresponding values in the Department table, the corresponding fields in the result table are NULL values.
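A left join can be sketched the same way. The `left_join` helper below is hypothetical and written only for this illustration; `None` stands in for the NULL values shown in the result table.

```python
# Example data copied from the Employee and Department tables above.
employees = [
    ("Dave Smith", "001", "Product Marketing Manager"),
    ("Julie Jones", "002", "Software Engineer"),
    ("Scott Tanner", "001", "Director of Demand Gen"),
    ("Ted Connors", "002", "Software Engineer"),
    ("Margaret Lane", "001", "VP of Marketing"),
    ("Mary Martin", "004", "Receptionist"),
]
departments = [
    ("Marketing", "001"),
    ("Engineering", "002"),
    ("Accounting", "003"),
]


def left_join(left, right):
    """Hypothetical helper: keep every left row, padding with None when unmatched."""
    result = []
    for name, dept_id, role in left:
        matches = [d for d in right if dept_id is not None and d[1] == dept_id]
        if matches:
            for dept_name, right_id in matches:
                result.append((name, dept_id, role, dept_name, right_id))
        else:
            # No department shares this key, so the right-hand fields are null.
            result.append((name, dept_id, role, None, None))
    return result


rows = left_join(employees, departments)
for row in rows:
    print(row)
```

Every employee appears in `rows`; only the Mary Martin row carries `None` in the department fields.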

Right Join

A right join (or right outer join) is the reverse of a left join. It does not require each key value in the secondary (right) table to have a matching record in the left table. Each row in the right table appears in the results, regardless of whether there are matches in the left table.

For the preceding example tables, a right join on the DepartmentID key produces the following result table:

| Employee.Name | Employee.DepartmentID | Employee.Role | Department.Name | Department.DepartmentID |
| --- | --- | --- | --- | --- |
| Dave Smith | 001 | Product Marketing Manager | Marketing | 001 |
| Julie Jones | 002 | Software Engineer | Engineering | 002 |
| Scott Tanner | 001 | Director of Demand Gen | Marketing | 001 |
| Ted Connors | 002 | Software Engineer | Engineering | 002 |
| Margaret Lane | 001 | VP of Marketing | Marketing | 001 |
| NULL | NULL | NULL | Accounting | 003 |

Notes:

  • In this right join, the Accounting entry is added. However, since there is no entry in the Employee table for the DepartmentID value, those fields are NULL values in the result set.
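The right join mirrors the left join: every department row is kept. The sketch below uses a hypothetical `right_join` helper. It lists matched rows in employee order and then appends unmatched departments, which reproduces the row order of the table above; that ordering is a choice made for this example, not a documented guarantee.

```python
# Example data copied from the Employee and Department tables above.
employees = [
    ("Dave Smith", "001", "Product Marketing Manager"),
    ("Julie Jones", "002", "Software Engineer"),
    ("Scott Tanner", "001", "Director of Demand Gen"),
    ("Ted Connors", "002", "Software Engineer"),
    ("Margaret Lane", "001", "VP of Marketing"),
    ("Mary Martin", "004", "Receptionist"),
]
departments = [
    ("Marketing", "001"),
    ("Engineering", "002"),
    ("Accounting", "003"),
]


def right_join(left, right):
    """Hypothetical helper: keep every right row, padding with None when unmatched."""
    result = []
    matched_right = set()  # indexes of department rows that found a partner
    for name, dept_id, role in left:
        for i, (dept_name, right_id) in enumerate(right):
            if dept_id is not None and dept_id == right_id:
                result.append((name, dept_id, role, dept_name, right_id))
                matched_right.add(i)
    for i, (dept_name, right_id) in enumerate(right):
        if i not in matched_right:
            # No employee shares this key, so the left-hand fields are null.
            result.append((None, None, None, dept_name, right_id))
    return result


rows = right_join(employees, departments)
for row in rows:
    print(row)
```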

Full Outer Join

A full outer join combines the effects of a left join and a right join. If there is a match between the key values, a row is written in the result.

  • If there is no match for a key value that appears in either table, a single record is written to the result, with NULL values inserted for the fields from the other table.

For the preceding example tables, a full outer join on the DepartmentID key produces the following result table:

| Employee.Name | Employee.DepartmentID | Employee.Role | Department.Name | Department.DepartmentID |
| --- | --- | --- | --- | --- |
| Dave Smith | 001 | Product Marketing Manager | Marketing | 001 |
| Julie Jones | 002 | Software Engineer | Engineering | 002 |
| Scott Tanner | 001 | Director of Demand Gen | Marketing | 001 |
| Ted Connors | 002 | Software Engineer | Engineering | 002 |
| Margaret Lane | 001 | VP of Marketing | Marketing | 001 |
| Mary Martin | 004 | Receptionist | NULL | NULL |
| NULL | NULL | NULL | Accounting | 003 |

Notes:

  • Rows that would be produced by both the left join and the right join (the matched rows) appear only once in the results.
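As a rough sketch, a full outer join can be built in one pass over the left table followed by a sweep for unmatched right rows, so matched rows are emitted exactly once. The `full_outer_join` helper is hypothetical and exists only for this illustration.

```python
# Example data copied from the Employee and Department tables above.
employees = [
    ("Dave Smith", "001", "Product Marketing Manager"),
    ("Julie Jones", "002", "Software Engineer"),
    ("Scott Tanner", "001", "Director of Demand Gen"),
    ("Ted Connors", "002", "Software Engineer"),
    ("Margaret Lane", "001", "VP of Marketing"),
    ("Mary Martin", "004", "Receptionist"),
]
departments = [
    ("Marketing", "001"),
    ("Engineering", "002"),
    ("Accounting", "003"),
]


def full_outer_join(left, right):
    """Hypothetical helper: keep all rows from both tables, matched where possible."""
    result = []
    matched_right = set()  # indexes of department rows that found a partner
    for name, dept_id, role in left:
        found = False
        for i, (dept_name, right_id) in enumerate(right):
            if dept_id is not None and dept_id == right_id:
                result.append((name, dept_id, role, dept_name, right_id))
                matched_right.add(i)
                found = True
        if not found:
            # Unmatched left row: department fields are null.
            result.append((name, dept_id, role, None, None))
    for i, (dept_name, right_id) in enumerate(right):
        if i not in matched_right:
            # Unmatched right row: employee fields are null.
            result.append((None, None, None, dept_name, right_id))
    return result


rows = full_outer_join(employees, departments)
for row in rows:
    print(row)
```

The result has the five matched rows once each, plus one null-padded row for Mary Martin and one for Accounting.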

Cross Join

A cross join combines each row of the first dataset with each row of the second dataset, so that every combination is represented in the output. As a result, the total number of rows in the join is:

Rows(DatasetA) * Rows(DatasetB)

NOTE: Depending on the size of your datasets, a cross join can greatly expand the size of the output, which may increase costs in some environments.
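To make the row-count formula concrete, the following sketch builds a cross join of the two example tables with Python's standard `itertools.product`. It is an illustration only; the variable names are assumptions made for this example.

```python
from itertools import product

# Example data copied from the Employee and Department tables above.
employees = [
    ("Dave Smith", "001", "Product Marketing Manager"),
    ("Julie Jones", "002", "Software Engineer"),
    ("Scott Tanner", "001", "Director of Demand Gen"),
    ("Ted Connors", "002", "Software Engineer"),
    ("Margaret Lane", "001", "VP of Marketing"),
    ("Mary Martin", "004", "Receptionist"),
]
departments = [
    ("Marketing", "001"),
    ("Engineering", "002"),
    ("Accounting", "003"),
]

# Pair every employee row with every department row; no key is compared.
rows = [emp + dept for emp, dept in product(employees, departments)]

# Rows(DatasetA) * Rows(DatasetB) = 6 * 3
print(len(rows))
```

With 6 employee rows and 3 department rows the output has 18 rows, which shows how quickly the result grows as the inputs get larger.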

Joins Together

The following diagram summarizes the relationships between the types of supported joins. In each Venn diagram, the area of intersection is the set of records that contain shared key values.

Figure: Join Types

After you have created a join, you can modify it through the Recipe panel. For more information, see Edit a Join.
