Excerpt |
---|
This example describes how to generate random array (list) data and then to apply statistical functions specifically created for arrays. |
Functions:
D generate list excerpts |
---|
pages | LISTSUM Function,LISTMIN Function,LISTMAX Function,LISTAVERAGE Function,LISTVAR Function,LISTSTDEV Function,LISTMODE Function |
---|
|
Also:
D generate list excerpts |
---|
pages | RANGE Function,RAND Function,ROUND Function |
---|
|
Source:
For this example, you can generate some randomized data using the following steps. First, you need to seed an array with a range of values using the RANGE function:
D trans |
---|
RawWrangle | true |
---|
p03Value | 'myArray1' |
---|
Type | step |
---|
WrangleText | derive type: single value: RANGE(5, 50, 5) as: 'myArray1' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | RANGE(5, 50, 5) |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Then, unpack this array, so you can add a random factor:
D trans |
---|
RawWrangle | true |
---|
p03Value | true |
---|
Type | step |
---|
WrangleText | unnest col: myArray1 keys: '[0]', '[1]', '[2]', '[3]', '[4]', '[5]', '[6]', '[7]', '[8]', '[9]' pluck: true markLineage: true |
---|
p01Name | Column |
---|
p01Value | myArray1 |
---|
p02Name | Paths to elements |
---|
p02Value | '[0]', '[1]', '[2]', '[3]', '[4]', '[5]', '[6]', '[7]', '[8]', '[9]' |
---|
p03Name | Remove elements from original |
---|
p04Value | true |
---|
p04Name | Include original column name |
---|
SearchTerm | Unnest Objects into columns |
---|
|
Add the randomizing factor. Here, you are adding randomization around individual values: x-1 < x < x+4.
D trans |
---|
RawWrangle | true |
---|
Type | step |
---|
WrangleText | set col: myArray1_0~myArray1_8 value: IF(RAND() > 0.5, $col + (5 * RAND()), $col - RAND()) |
---|
p01Name | Columns |
---|
p01Value | myArray1_0~myArray1_8 |
---|
p02Name | Formula |
---|
p02Value | IF(RAND() > 0.5, $col + (5 * RAND()), $col - RAND()) |
---|
SearchTerm | Edit column with formula |
---|
|
To make the numbers easier to manipulate, you can round them to two decimal places:
D trans |
---|
RawWrangle | true |
---|
Type | step |
---|
WrangleText | set col: myArray1_0~myArray1_8 value: ROUND($col, 2) |
---|
p01Name | Columns |
---|
p01Value | myArray1_0~myArray1_8 |
---|
p02Name | Formula |
---|
p02Value | ROUND($col, 2) |
---|
SearchTerm | Edit column with formula |
---|
|
Renest these columns into an array:
D trans |
---|
RawWrangle | true |
---|
p03Value | 'myArray2' |
---|
Type | step |
---|
WrangleText | nest col: myArray1_0, myArray1_1, myArray1_2, myArray1_3, myArray1_4, myArray1_5, myArray1_6, myArray1_7, myArray1_8 into: array as: 'myArray2' |
---|
p01Name | Columns |
---|
p01Value | myArray1_0, myArray1_1, myArray1_2, myArray1_3, myArray1_4, myArray1_5, myArray1_6, myArray1_7, myArray1_8 |
---|
p02Name | Nest columns to |
---|
p02Value | Array |
---|
p03Name | New column name |
---|
SearchTerm | Nest columns into Objects |
---|
|
Delete the unused columns:
D trans |
---|
RawWrangle | true |
---|
Type | step |
---|
WrangleText | drop col: myArray1_0~myArray1_8,myArray1 action: Drop |
---|
p01Name | Columns |
---|
p01Value | myArray1_0~myArray1_8,myArray1 |
---|
p02Name | Action |
---|
p02Value | Delete selected columns |
---|
SearchTerm | Delete columns |
---|
|
Your data should look similar to the following:
myArray2 |
---|
["8.29","9.63","14.63","19.63","24.63","29.63","34.63","39.63","44.63"] |
["8.32","14.01","19.01","24.01","29.01","34.01","39.01","44.01","49.01"] |
["4.55","9.58","14.58","19.58","24.58","29.58","34.58","39.58","44.58"] |
["9.22","14.84","19.84","24.84","29.84","34.84","39.84","44.84","49.84"] |
["8.75","13.36","18.36","23.36","28.36","33.36","38.36","43.36","48.36"] |
["8.47","14.76","19.76","24.76","29.76","34.76","39.76","44.76","49.76"] |
["4.93","9.99","14.99","19.99","24.99","29.99","34.99","39.99","44.99"] |
["4.65","14.98","19.98","24.98","29.98","34.98","39.98","44.98","49.98"] |
["7.80","14.62","19.62","24.62","29.62","34.62","39.62","44.62","49.62"] |
["9.32","9.96","14.96","19.96","24.96","29.96","34.96","39.96","44.96"] |
Transformation:
These steps demonstrate the individual math functions that you can apply to your list data without unnesting it:
Info |
---|
NOTE: The NUMFORMAT function has been wrapped around each list function to account for any floating-point errors or additional digits in the results. |
Sum of all values in the array (list):
D trans |
---|
RawWrangle | true |
---|
p03Value | 'arraySum' |
---|
Type | step |
---|
WrangleText | derive type: single value: NUMFORMAT(LISTSUM(myArray2), '#.##') as: 'arraySum' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | NUMFORMAT(LISTSUM(myArray2), '#.##') |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Minimum of all values in the array (list):
D trans |
---|
RawWrangle | true |
---|
p03Value | 'arrayMin' |
---|
Type | step |
---|
WrangleText | derive type: single value: NUMFORMAT(LISTMIN(myArray2), '#.##') as: 'arrayMin' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | NUMFORMAT(LISTMIN(myArray2), '#.##') |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Maximum of all values in the array (list):
D trans |
---|
RawWrangle | true |
---|
p03Value | 'arrayMax' |
---|
Type | step |
---|
WrangleText | derive type: single value: NUMFORMAT(LISTMAX(myArray2), '#.##') as: 'arrayMax' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | NUMFORMAT(LISTMAX(myArray2), '#.##') |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Average of all values in the array (list):
D trans |
---|
RawWrangle | true |
---|
p03Value | 'arrayAvg' |
---|
Type | step |
---|
WrangleText | derive type: single value: NUMFORMAT(LISTAVERAGE(myArray2), '#.##') as: 'arrayAvg' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | NUMFORMAT(LISTAVERAGE(myArray2), '#.##') |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Variance of all values in the array (list):
D trans |
---|
RawWrangle | true |
---|
p03Value | 'arrayVar' |
---|
Type | step |
---|
WrangleText | derive type: single value: NUMFORMAT(LISTVAR(myArray2), '#.##') as: 'arrayVar' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | NUMFORMAT(LISTVAR(myArray2), '#.##') |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Standard deviation of all values in the array (list):
D trans |
---|
RawWrangle | true |
---|
p03Value | 'arrayStDv' |
---|
Type | step |
---|
WrangleText | derive type: single value: NUMFORMAT(LISTSTDEV(myArray2), '#.##') as: 'arrayStDv' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | NUMFORMAT(LISTSTDEV(myArray2), '#.##') |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Mode (most common value) of all values in the array (list):
D trans |
---|
RawWrangle | true |
---|
p03Value | 'arrayMode' |
---|
Type | step |
---|
WrangleText | derive type: single value: NUMFORMAT(LISTMODE(myArray2), '#.##') as: 'arrayMode' |
---|
p01Name | Formula type |
---|
p01Value | Single row formula |
---|
p02Name | Formula |
---|
p02Value | NUMFORMAT(LISTMODE(myArray2), '#.##') |
---|
p03Name | New column name |
---|
SearchTerm | New formula |
---|
|
Results:
Results for the first four math functions:
myArray2 | arrayAvg | arrayMax | arrayMin | arraySum |
---|
["8.29","9.63","14.63","19.63","24.63","29.63","34.63","39.63","44.63"] | 25.04 | 44.63 | 8.29 | 225.33 |
["8.32","14.01","19.01","24.01","29.01","34.01","39.01","44.01","49.01"] | 28.93 | 49.01 | 8.32 | 260.4 |
["4.55","9.58","14.58","19.58","24.58","29.58","34.58","39.58","44.58"] | 24.58 | 44.58 | 4.55 | 221.19 |
["9.22","14.84","19.84","24.84","29.84","34.84","39.84","44.84","49.84"] | 29.77 | 49.84 | 9.22 | 267.94 |
["8.75","13.36","18.36","23.36","28.36","33.36","38.36","43.36","48.36"] | 28.4 | 48.36 | 8.75 | 255.63 |
["8.47","14.76","19.76","24.76","29.76","34.76","39.76","44.76","49.76"] | 29.62 | 49.76 | 8.47 | 266.55 |
["4.93","9.99","14.99","19.99","24.99","29.99","34.99","39.99","44.99"] | 24.98 | 44.99 | 4.93 | 224.85 |
["4.65","14.98","19.98","24.98","29.98","34.98","39.98","44.98","49.98"] | 29.39 | 49.98 | 4.65 | 264.49 |
["7.80","14.62","19.62","24.62","29.62","34.62","39.62","44.62","49.62"] | 29.42 | 49.62 | 7.8 | 264.76 |
["9.32","9.96","14.96","19.96","24.96","29.96","34.96","39.96","44.96"] | 25.44 | 44.96 | 9.32 | 229 |
Results for the statistical functions:
myArray2 | arrayMode | arrayStDv | arrayVar |
---|
["8.29","9.63","14.63","19.63","24.63","29.63","34.63","39.63","44.63"] | | 12.32 | 151.72 |
["8.32","14.01","19.01","24.01","29.01","34.01","39.01","44.01","49.01"] | | 13.03 | 169.78 |
["4.55","9.58","14.58","19.58","24.58","29.58","34.58","39.58","44.58"] | | 12.92 | 166.8 |
["9.22","14.84","19.84","24.84","29.84","34.84","39.84","44.84","49.84"] | | 13.02 | 169.46 |
["8.75","13.36","18.36","23.36","28.36","33.36","38.36","43.36","48.36"] | | 12.84 | 164.95 |
["8.47","14.76","19.76","24.76","29.76","34.76","39.76","44.76","49.76"] | | 13.14 | 172.56 |
["4.93","9.99","14.99","19.99","24.99","29.99","34.99","39.99","44.99"] | | 12.92 | 166.93 |
["4.65","14.98","19.98","24.98","29.98","34.98","39.98","44.98","49.98"] | | 13.9 | 193.16 |
["7.80","14.62","19.62","24.62","29.62","34.62","39.62","44.62","49.62"] | | 13.23 | 175.08 |
["9.32","9.96","14.96","19.96","24.96","29.96","34.96","39.96","44.96"] | | 12.21 | 149.17 |
Since all values are unique within an individual array, there is no most common value in any of them, which yields empty values for the arrayMode
column.
D s also |
---|
label | example_list_math_functions |
---|
|