NOTE: Transforms are a part of the underlying language that is not directly accessible to users. This content is maintained for reference purposes only.
NOTE: This transform is intended primarily for use in the Transformer page. Sort order may not be preserved in the output files.
- If you generate a new sample after a
sorttransform has been applied, the sort order is not retained. You can re-apply the sort step, although the following limitations still apply.
- Sort order is not preserved on output when the output is a multi-part file.
Output: Dataset is sorted in alphabetically ascending order based on the values in the
LastName column, assuming that the values are strings.
|sort||Y||transform||Name of the transform|
|order||Y||string||Name of column or columns by which to sort|
For more information on syntax standards, see Language Documentation Syntax Notes.
Identifies the column or set of columns by which the dataset is sorted.
- Multiple column names can be separated by commas.
- Ranges of columns cannot be specified.
The order can be reversed by adding a negative sign in front of the column name:
Multi-column sorts: You can also specify multi-column sorts. The following example sorts first by the inverse order of
ProductName and within that sort, rows are sorted by
Sort by original row numbers: As an input value, this parameter also accepts the
SOURCEROWNUMBER function, which performs the sort according to the original order of rows when the dataset was created.
|Yes||String (column name)|
Data is sorted based on the data type of the source:
|Data Type of Source||Sort Order|
Example - sort methods
The column without a name identifies the original row numbers. In the data grid, this information is available when you hover over the black dot to the left of a row of data.
First, you might want to clean up the number formatting in the
lastOrder column. The following formats the values to always include two digits after the decimal point:
Now, you're interested in the highest value for your customers' most recent orders. You can apply the following sort:
Rows are sorted by the
LastOrder column in descending order (largest to smallest):
The above row numbers represent the original order of the rows. Now, you want to get your data geographically organized by sorting by city and state. You can perform multi-column sorts such as the following, which sorts first by
State and then by
In the generated output, the data is first sorted by the
State value. Each set of rows within the same
State value is also sorted by the
Example - Sort by original row numbers
You have imported the following racer data on heat times from a CSV file. When loaded in the Transformer page, it looks like the following:
|1||Racer||Heat 1||Heat 2||Heat 3|
In the above, the
(rowId) column references the row numbers displayed in the data grid; it is not part of the dataset. This information is available when you hover over the black dot on the left side of the screen.
You have examined the best performance in each heat according to the sample. You then notice that the data contains headers, but you forget how it was originally sorted. The data now looks like the following:
|2||Racer||Heat 1||Heat 2||Heat 3|
While you can undo your sort steps to return to the original sort order, this approach works best if you did not include other steps in between that are based on the sort order.
If you have steps that require retaining your sort steps, you can revert to the original sort order by adding this transform step:
NOTE: Source row information may be lost after operations such as joins, unions, and aggregations are performed. In these cases, you cannot sort by the source row information. You may be able to generate a column of source row number earlier in your recipe.
Then, you can create the header with the following simple step:
After you have applied the last
header transform, your data should look like the following:
You can sort by the
Racer column in ascending order to return to the original sort order.
This page has no comments.