**Contents:**

**NOTE:** Transforms are a part of the underlying language, which is not directly accessible to users. This content is maintained for reference purposes only. For more information on the user-accessible equivalent to transforms, see Transformation Reference.

- This calculation can be based on all rows for totals across the dataset or based on group of rows you define in the transform.
- When your data has been pivoted, the unused columns are dropped, leaving only the operative data for faster evaluation.
- Transform accepts one or more columns as inputs to the pivot.

**Tip: **This transform is very useful in Preview mode for quick discovery and analysis.

## Basic Usage

pivot col: Dates value:SUM(Sales) group: prodId

**Output: **Reshapes your dataset to include a `ProdId`

column and new columns for each distinct value in the `Dates`

column. Each distinct `ProdId`

value is represented by a separate row in the reshaped dataset. Within each row, the new columns contain a sum of sales for the product for each date in the dataset.

## Syntax and Parameters

pivot col:column_ref value: FUNCTION(arg1,arg2) [group: group_col] [limit:int_num]

Token | Required? | Data Type | Description |
---|---|---|---|

pivot | Y | transform | Name of the transform |

col | N | string | Source column name or names |

value | N | string | Expression containing the aggregation function or functions used to pivot the source column data |

group | N | string | Column name or names containing the values by which to group for calculation |

limit | N | integer (positive) | Maximum number of unique values in a group. Default is 50. |

For more information on syntax standards, see Language Documentation Syntax Notes.

### col

Identifies the column or columns to which to apply the transform.

**Single-column example:**

The specified column contains the values that become new columns in the dataset.

pivot col: autoBrand value:AVERAGE(autoPrice) group:State

**Output: **Reshapes the dataset to contain the `State`

column followed by columns for each distinct value in the `autoBrand`

column. Each value in those columns is the average value in the `autoPrice`

column, as grouped by `State`

value.

**Multi-column example:**

For multiple input columns, the transform generates new columns for each combination of the inputs. See the example below.

**Usage Notes:**

Required? | Data Type |
---|---|

No | String (column name) |

### value

For this transform, the `value`

parameter contains the aggregation function and its parameters, which define the set of rows to which the function is applied.

**NOTE: **For the `value`

parameter, you can only use aggregation functions. Nested functions are not supported. For more information, see Aggregate Functions.

**Usage Notes:**

Required? | Data Type |
---|---|

No | String containing a list of aggregation functions each containing one column reference |

### group

For this transform, this parameter specifies the column whose values are used to group the dataset prior to applying the specified function. You can specify multiple columns as comma-separated values.

**NOTE:**Transforms that use the

`group`

parameter can result in non-deterministic re-ordering in the data grid.
However, if you're running your job on the Spark running environment, you should apply the `group`

parameter, or your job may run out of memory and fail. To avoid this issue and to enforce row ordering, use the `sort`

transform. For more information, see Sort Transform.

**Usage Notes:**

Required? | Data Type |
---|---|

No | String (column name) |

### limit

The `limit`

parameter defines the maximum number of unique values permitted in a group. If it is not specified, the default value for this parameter is `50`

.

**NOTE: **Be careful setting this parameter too high. In some cases, the application can run out of memory generating the results, and your results can fail.

**Usage Notes:**

Required? | Data Type |
---|---|

No. Default value is `50` . | Integer (positive) |

## Examples

**Tip:** For additional examples, see Common Tasks.

### Example - Basic Pivot

**Source:**

The following dataset contains information about sales across one weekend and two states for three different products.

Date | State | ProdId | Sales |
---|---|---|---|

3/9/16 | CA | Big Trike 9000 | 500 |

3/9/16 | NV | Fast GoKart 5000 | 200 |

3/9/16 | CA | SuperQuick Scooter | 700 |

3/9/16 | CA | Fast GoKart 5000 | 900 |

3/9/16 | NV | Big Trike 9000 | 300 |

3/9/16 | NV | SuperQuick Scooter | 250 |

3/10/16 | NV | Fast GoKart 5000 | 50 |

3/10/16 | NV | Big Trike 9000 | 400 |

3/10/16 | NV | SuperQuick Scooter | 150 |

3/10/16 | CA | Big Trike 9000 | 600 |

3/10/16 | CA | SuperQuick Scooter | 800 |

3/10/16 | CA | Fast GoKart 5000 | 1100 |

**Transform 1: **Sum of sales by State for each Date

Apply this transform in Preview only, just so you can see the results:

pivot col:Date value:SUM(Sales) group:State

Cancel the transform update.

State | sum_Sales_03/09/2016 | sum_Sales_03/10/2016 |
---|---|---|

CA | 2100 | 2500 |

NV | 750 | 600 |

** Transform 2: **Sum of Sales by Date for each State

pivot col:State value:SUM(Sales) group:Date

Date | sum_Sales_CA | sum_Sales_NV |
---|---|---|

03/09/2016 | 2100 | 750 |

03/10/2016 | 2500 | 600 |

Cancel the transform update again.

**Transform 3: **Sum of Sales by product ID for each State

pivot col:State value:SUM(Sales) group:ProdId

ProdId | sum_Sales_CA | sum_Sales_NV |
---|---|---|

Big Trike 9000 | 1100 | 700 |

Fast GoKart 5000 | 2000 | 500 |

SuperQuick Scooter | 1500 | 300 |

### Example - Multi-column Pivot

The `pivot`

transform supports the ability to use the values in multiple columns to specify the columns that are generated. When two or more columns are used in the pivot, columns containing values of all possible combinations from the source columns are generated.

In the following source data, sales data on individual products is organized by brand, product name, and month:

Brand | Product | Month | Sales |
---|---|---|---|

AAA | Towels | January | 95 |

AAA | Napkins | January | 113 |

B | Towels | January | 99 |

B | Tissues | January | 88 |

AAA | Towels | February | 108 |

AAA | Napkins | February | 91 |

B | Towels | February | 85 |

B | Tissues | February | 105 |

AAA | Towels | March | 81 |

AAA | Napkins | March | 92 |

B | Towels | March | 112 |

B | Tissues | March | 104 |

**Transform:**

If you wanted to create summary sales information for each product by month, you might choose to create a `pivot`

on the `Product`

column. However, if you look at the column values, you might notice that the `Paper towels`

product is available for both the `AAA`

and `B`

brands. In this case, you must perform a multi-column pivot on these two columns, like the following:

pivot col:Brand,Product value:AVERAGE(Sales) group:Month

**Results:**

Month | avg_Sales_B_Towels | avg_Sales_B_Tissues | avg_Sales_AAA_Napkins | avg_Sales_AAA_Towels |
---|---|---|---|---|

January | 99 | 88 | 113 | 95 |

February | 85 | 105 | 91 | 108 |

March | 112 | 104 | 92 | 81 |

### Example - Aggregate Values

You can use the `pivot`

transform to perform aggregation calculations on values in a column.

**Source:**

In the following table, you can review test results from three different tests for 10 students:

StudentId | TestId | Score |
---|---|---|

s001 | t001 | 98 |

s001 | t002 | 98 |

s001 | t003 | 87 |

s002 | t001 | 92 |

s002 | t002 | 96 |

s002 | t003 | 79 |

s003 | t001 | 99 |

s003 | t002 | 76 |

s003 | t003 | 94 |

s004 | t001 | 93 |

s004 | t002 | 99 |

s004 | t003 | 80 |

s005 | t001 | 79 |

s005 | t002 | 80 |

s005 | t003 | 84 |

s006 | t001 | 93 |

s006 | t002 | 74 |

s006 | t003 | 89 |

s007 | t001 | 86 |

s007 | t002 | 81 |

s007 | t003 | 97 |

s008 | t001 | 82 |

s008 | t002 | 73 |

s008 | t003 | 79 |

s009 | t001 | 96 |

s009 | t002 | 97 |

s009 | t003 | 79 |

s010 | t001 | 80 |

s010 | t002 | 79 |

s010 | t003 | 75 |

**Transform:**

For each student, you're interested in a student's average score and maximum score. You create the following pivot transform, which groups computations of average score and maximum score columns by `studentId`

value.

pivot group: StudentId value: average(Score),max(Score) limit: 50

For the generated average column, you want results rounded to two decimal places:

set col: average_Score value: round(average_Score, 2)

**Results:**

StudentId | average_Score | max_Score |
---|---|---|

s001 | 94.33 | 98 |

s002 | 89 | 96 |

s003 | 89.67 | 99 |

s004 | 90.67 | 99 |

s005 | 81 | 84 |

s006 | 85.33 | 93 |

s007 | 88 | 97 |

s008 | 78 | 82 |

s009 | 90.67 | 97 |

s010 | 78 | 80 |

This page has no comments.