## Key

• This line was removed.
• Formatting was changed.

This example illustrates how you can apply statistical functions to your dataset. Calculations include average (mean), max, min, standard deviation, and variance.

Source:

Students took a test and recorded the following scores. You want to perform some statistical analysis on them:

StudentScore
Anna84
Ben71
Caleb76
Danielle87
Evan85
Faith92
Gabe85
Hannah99
Ian73
Jane68

Transformation:

You can use the following transformations to calculate the average (mean), minimum, and maximum scores:

D trans
RawWrangle true 'avgScore' step derive type:single value:AVERAGE(Score) as:'avgScore' Formula type Single row formula Formula AVERAGE(Score) New column name New formula

D trans
RawWrangle true 'minScore' step derive type:single value:MIN(Score) as:'minScore' Formula type Single row formula Formula MIN(Score) New column name New formula

D trans
RawWrangle true 'maxScore' step derive type:single value:MAX(Score) as:'maxScore' Formula type Single row formula Formula MAX(Score) New column name New formula

To apply statistical functions to your data, you can use the `VAR` and `STDEV` functions, which can be used as the basis for other statistical calculations.

D trans
RawWrangle true var_Score step derive type:single value:VAR(Score) as:'var_Score' Formula type Single row formula Formula VAR(Score) New column name New formula

D trans
RawWrangle true stdev_Score step derive type:single value:STDEV(Score) as:'stdev_Score' Formula type Single row formula Formula STDEV(Score) New column name New formula

For each score, you can now calculate the variation of each one from the average, using the following:

D trans
RawWrangle true 'stDevs' step derive type:single value:((Score - avg_Score) / stdev_Score) as:'stDevs' Formula type Single row formula Formula ((Score - avg_Score) / stdev_Score) New column name New formula

Now, you want to apply grades based on a formula:

AstDevs > 1
BstDevs > 0.5
C-1 <= stDevs <= 0.5
DstDevs < -1
FstDevs < -2

You can build the following transformation using the `IF` function to calculate grades.

D trans
RawWrangle true step derive type:single value:IF((stDevs > 1),'A',IF((stDevs < -2),'F',IF((stDevs < -1),'D',IF((stDevs > 0.5),'B','C')))) Formula type Single row formula Formula IF((stDevs > 1),'A',IF((stDevs < -2),'F',IF((stDevs < -1),'D',IF((stDevs > 0.5),'B','C')))) New formula

To clean up the content, you might want to apply some formatting to the score columns. The following reformats the `stdev_Score` and `stDevs` columns to display two decimal places:

D trans
RawWrangle true step set col:stdev_Score value:NUMFORMAT(stdev_Score, '##.00') Columns stdev_Score Formula NUMFORMAT(stdev_Score, '##.00') Edit column with formula

D trans
RawWrangle true step set col:stDevs value:NUMFORMAT(stDevs, '##.00') Columns stDevs Formula NUMFORMAT(stDevs, '##.00') Edit column with formula

D trans
RawWrangle true 'modeScore' step derive type:single value:MODE(Score) as:'modeScore' Formula type Single row formula Formula MODE(Score) New column name New formula

Results:

Anna8485826899

87.00000000000001

9.330.21C
Ben718582689987.000000000000019.33-1.18D
Caleb768582689987.000000000000019.33-0.64C
Danielle878582689987.000000000000019.330.54B
Evan858582689987.000000000000019.330.32C
Faith928582689987.000000000000019.331.07A
Gabe858582689987.000000000000019.330.32C
Hannah998582689987.000000000000019.331.82A
Ian738582689987.000000000000019.33-0.96C
Jane688582689987.000000000000019.33-1.50D