Date: Tue, 25 Jan 2022 22:08:40 +0000 (GMT) Message-ID: <2129572506.107968.1643148520077@9c5033e110b2> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_107967_1804779269.1643148520077" ------=_Part_107967_1804779269.1643148520077 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html VARSAMPIF Function

VARSAMPIF Function

Contents:

=20

=20
=20
=20

=20
=20

Generates the variance of values by group in a column that meet a sp= ecific condition=20 using the sample statistical method.

NOTE: When added to a transform, this function is appli= ed to the current sample. If you change your sample or run the job, the com= puted values for this function are updated. Transforms that change the numb= er of rows in subsequent recipe steps do not affect the values computed for= this step.

NOTE: This function applies to a sample of the entire p= opulation. More information is below.

Terms...
=20

Relevant terms:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
TermDescription
PopulationPopulation statistical functions are computed fr= om all possible values. See https://en.wikip= edia.org/wiki/Statistical_population.
Sample=20
=20

Sample-based statistical functions are computed from a subset or sample = of all values. See https://en.wikipedia.org/w= iki/Sampling_(statistics).

=20

These function names include SAMP in their name.

=20
=20

NOTE: Statistical sampling has no relationship to the = samples taken within the product. When statistical functions are computed d= uring job execution, they are applied across the entire dataset. Sample met= hod calculations are computed at that time.

=20
=20
=20

• This function is calculated across a sample of all values.

• For more information on a population version of this function, see = ;VARIF Function.

Wrangle vs. SQL: This function is part of Wrangle , a proprietary data transformation language.= Wrangle is not SQL. For more information, se= e Wrangle Language.

Basic Usage=

=20

varsampif(testScores, ((testScores > 0)= && (testScores < 90)))

Ou= tput: Returns the variance of the testScores column when the testScores value is between 0 and 90 using = the sample method of calculation.

Sy= ntax and Arguments

=20

varsampif(col_ref, test_expre= ssion) [group:group_col_ref] [limit:limit_count]

Argument Required? Data Type Description
col_ref Y string Reference to the column you wish t= o evaluate.
test_expression Y string Expression that is evaluated. Must resolve to true or false

For more information on syntax standards, see Language Documentation Syntax Notes<= /a>.

col_ref

Name of the column whose values you wish to use in the calculation. Colu= mn must be a numeric (Integer or Decimal) type.

Usage Notes:

Required? Data Type Example Value
Yes String that corresponds to the name of the colum= n myValues

test_expression

This parameter contains the expression to evaluate. This expression must= resolve to a Boolean (true or false) v= alue.

Usage Notes:

Required? Data Type Example Value
Yes String expression that evaluates to true or false (LastName =3D=3D 'Mouse' && FirstN= ame =3D=3D 'Mickey')

Examples

Exampl= e - Conditional Calculation Functions

This example shows some of the statistical functions that use the sample= method of computation.

Functions:

=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Item Description
STDEVSAMP Function Computes the standard deviation across column v= alues of Integer or Decimal type using the sample statistical method<= /span>.
VARSAMP Function Computes the variance among all values in a col= umn using the sample statistical method. Input column can be o= f Integer or Decimal. If no numeric values are detected in the input c= olumn, the function returns 0
STDEVSAMPIF Function Generates the standard deviation of values by g= roup in a column that meet a specific condition using the sample stat= istical method.
VARSAMPIF Function Generates the variance of values by group in a = column that meet a specific condition using the sample statistical me= thod.
ROUND Function Rounds input value to the nearest integer. = ;Input can be an Integer, a Decimal, a column reference, or an express= ion. Optional second argument can be used to specify the number of dig= its to which to round.

Source:

Students took tests on three consecutive Saturdays:

=20
Student Date Score
Andrew 11/9/19 81
Bella 11/9/19 84
Christina 11/9/19 79
David 11/9/19 64
Ellen 11/9/19 61
Fred 11/9/19 63
Andrew 11/16/19 73
Bella 11/16/19 88
Christina 11/16/19 78
David 11/16/19 67
Ellen 11/16/19 87
Fred 11/16/19 90
Andrew 11/23/19 76
Bella 11/23/19 93
Christina 11/23/19 81
David 11/23/19 97
Ellen 11/23/19 97
Fred 11/23/19 91

Transformation:

You can use the following transformations to calculate standard deviatio= n and variance across all dates using the sample method. Each computation h= as been rounded to three digits.

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula round(stdevsamp(Score), 3) 'stdevSamp'
=20

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula round(varsamp(Score), 3) 'varSamp'
=20

You can use the following to limit the previous statistical computations= to the last two Saturdays of testing:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula round(stdevsampif(Score, Date !=3D '11\/9\= /2019'), 3) 'stdevSampIf'
=20

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula round(varsampif(Score, Date !=3D '11\/9\/2= 019'), 3) 'varSampIf'
=20

Results:

Student Date Score varSampIf stdevSampIf varSamp stdevSamp
Andrew 11/9/19 81 94.515 9.722 131.673 11.475
Bella 11/9/19 84 94.515 9.722 131.673 11.475
Christina 11/9/19 79 94.515 9.722 131.673 11.475
David 11/9/19 64 94.515 9.722 131.673 11.475
Ellen 11/9/19 61 94.515 9.722 131.673 11.475
Fred 11/9/19 63 94.515 9.722 131.673 11.475
Andrew 11/16/19 73 94.515 9.722 131.673 11.475
Bella 11/16/19 88 94.515 9.722 131.673 11.475
Christina 11/16/19 78 94.515 9.722 131.673 11.475
David 11/16/19 67 94.515 9.722 131.673 11.475
Ellen 11/16/19 87 94.515 9.722 131.673 11.475
Fred 11/16/19 90 94.515 9.722 131.673 11.475
Andrew 11/23/19 76 94.515 9.722 131.673 11.475
Bella 11/23/19 93 94.515 9.722 131.673 11.475
Christina 11/23/19 81 94.515 9.722 131.673 11.475
David 11/23/19 97 94.515 9.722 131.673 11.475
Ellen 11/23/19 97 94.515 9.722 131.673 11.475
Fred 11/23/19 91 94.515 9.722 131.673 11.475

=20
=20

=20

=20

=20
=20
=20

=20
=20