The |
Optionally, you can insert an integer as a parameter.
Column references or functions returning Integer values are not supported.
Example:
rand() |
Output: For each row, generate a random number between 0 and 1 in the new random
function.
Example with seed value:
rand(2) |
Output: For each row, generate a random number between 0 and 1 in the new random
function. The generated random set of random values are consistent between job executions and are, in part, governed by the seed value 2
.
There are no arguments for this function.
rand([int_value]) |
Argument | Required? | Data Type | Description |
---|---|---|---|
int_value | N | integer | Integer literal |
Optional Integer literal that is used to generate random numbers. Use of a seed value ensures consistency of output between job executions.
Required? | Data Type | Example Value |
---|---|---|
No | Integer literal | 14
|
In the following example, the random
column is generated by the RAND
function:
source | random |
---|---|
A |
|
B |
|
C |
|
D |
|
The RAND
function is typically used to introduce randomness of some kind in your data. In the following example, it is used to perform sampling within your wider dataset.
Source:
You want to extract a random sample of 20% of your set of orders for further study:
OrderId | Qty | ProdId |
---|---|---|
1001 | 30 | Widgets |
1002 | 10 | Big Widgets |
1003 | 5 | Big Widgets |
1004 | 40 | Widgets |
1005 | 80 | Tiny Widgets |
1006 | 20 | Widgets |
1007 | 100 | Tiny Widgets |
Transformation:
You can use the following transform to generate a random integer from one to 10:
You can now use the following transform to keep only the rows that contain random values that are in the top 20%, where the value is 9
or 10
:
Results:
NOTE: Since the results are randomized, your results might vary. |
OrderId | Qty | ProdId | random |
---|---|---|---|
1005 | 80 | Tiny Widgets | 9 |
1007 | 100 | Tiny Widgets | 10 |