Page tree

 

Support | BlogContact Us | 844.332.2821

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. If your primary key spans multiple columns, use the merge transform to bring the values into a single column:

    D code

    merge col:RestaurantName,Address,Zip with:'-'

  2. Rename the generated column: PrimaryKey.

  3. From the PrimaryKey column drop-down, select Sort ascending.

    Info

    NOTE: If it is important to keep the first entry of a specific column in your dataset, you can add that column as a secondary sort column, as in the following transform:

    D code

    sort order: PrimaryKey,-YearsMember

    The dash before the secondary column indicates to sort in descending order.

  4. Use the following transform to generate a new column, comparing each value in the PrimaryKey column to the previous one:

    D code

    window value: PREV(PrimaryKey, 1) order: PrimaryKey

  5. For each row, the value of the new column is the value in the PrimaryKey for the previous row. Now, test if this value is the same as the value in the PrimaryKey column for the current row:

    D code

    derive value:((window==PrimaryKey) ? true : false)

  6. The new column (IsDupe) contains true for duplicate primary keys. Delete the rows that are duplicates:

    D code

    delete row:(IsDupe==true)

  7. Drop any generated columns that are no longer needed. 

 

Deduplicate Columns

While this form of duplicate data is rarer, you might want to check on the possibility of duplicate data between your columns. To check for duplicate column data, you can use a transform similar to the following:

...