Date: Tue, 25 Jan 2022 19:52:32 +0000 (GMT) Message-ID: <2062672250.107578.1643140352230@9c5033e110b2> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_107577_1017573689.1643140352230" ------=_Part_107577_1017573689.1643140352230 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html VARIF Function

VARIF Function

Contents:

=20

=20
=20
=20

=20
=20

Generates the variance of values by group in a column that meet a sp= ecific condition.

NOTE: When added to a transform, this function is appli= ed to the current sample. If you change your sample or run the job, the com= puted values for this function are updated. Transforms that change the numb= er of rows in subsequent recipe steps do not affect the values computed for= this step.

Terms...
=20

Relevant terms:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
TermDescription
PopulationPopulation statistical functions are computed fr= om all possible values. See https://en.wikip= edia.org/wiki/Statistical_population.
Sample=20
=20

Sample-based statistical functions are computed from a subset or sample = of all values. See https://en.wikipedia.org/w= iki/Sampling_(statistics).

=20

These function names include SAMP in their name.

=20
=20

NOTE: Statistical sampling has no relationship to the = samples taken within the product. When statistical functions are computed d= uring job execution, they are applied across the entire dataset. Sample met= hod calculations are computed at that time.

=20
=20
=20

• This function is calculated across the entire population.

• For more information on a sampled version of this function, see VARSAMPIF Function.

Wrangle vs. SQL: This function is part of Wrangle , a proprietary data transformation language.= Wrangle is not SQL. For more information, se= e Wrangle Language.

Basic Usage

=20

varif(testScores, ((testScores > 0) &am= p;& (testScores < 90))

Ou= tput: Returns the variance of the testScores column when the testScores value is between 0 and 90.

Syntax= and Arguments

=20

varif(col_ref, test_expression) [group:gro= up_col_ref] [limit:limit_count]

Argument Required? Data Type Description
col_ref Y string Reference to the column you wish t= o evaluate.
test_expression Y string Expression that is evaluated. Must resolve to true or false

For more information on syntax standards, see Language Documentation Syntax Notes<= /a>.

col_ref

Name of the column whose values you wish to use in the calculation. Colu= mn must be a numeric (Integer or Decimal) type.

Usage Notes:

Required? Data Type Example Value
Yes String that corresponds to the name of the colum= n myValues

test_expression

This parameter contains the expression to evaluate. This expression must= resolve to a Boolean (true or false) v= alue.

Usage Notes:

Required? Data Type Example Value
Yes String expression that evaluates to true or false (LastName =3D=3D 'Mouse' && FirstN= ame =3D=3D 'Mickey')

Examples

Example - = Conditional Calculation Functions

This example illustrates how to use the conditional calculation function= s.

Functions:

=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Item Description
AVERAGEIF Function Generates the average value of rows in each gro= up that meet a specific condition. Generated value is of Decimal type. = ;
MINIF Function Generates the minimum value of rows in each gro= up that meet a specific condition. Inputs can be Integer, Decimal, or Datet= ime.
MAXIF Function Generates the maximum value of rows in each gro= up that meet a specific condition. Inputs can be Integer, Decimal, or= Datetime.
VARIF Function Generates the variance of values by group in a = column that meet a specific condition.
STDEVIF Function Generates the standard deviation of values by g= roup in a column that meet a specific condition.

Source:

Here is some example weather data:

=20 =20 =20 =20 =20 =20
date city rain temp wind
1/23/17 Valleyville 0.00 12.8 6.7
1/23/17 Center Town 0.31 9.4 5.3
1/23/17 Magic Mountain 0.00 0.0 7.3
1/24/17 Valleyville 0.25 17.2 3.3
1/24/17 Center Town 0.54 1.1 7.6
1/24/17 Magic Mountain 0.32 5.0 8.8
1/25/17 Valleyville 0.02 3.3 6.8
1/25/17 Center Town 0.83 3.3 5.1
1/25/17 Magic Mountain 0.59 -1.7 6.4
1/26/17 Valleyville 1.08 15.0 4.2
1/26/17 Center Town 0.96 6.1 7.6
1/26/17 Magic Mountain 0.77 -3.9 3.0
1/27/17 Valleyville 1.00 7.2 2.8
1/27/17 Center Town 1.32 20.0 0.2
1/27/17 Magic Mountain 0.77 5.6 5.2
1/28/17 Valleyville 0.12 -6.1 5.1
1/28/17 Center Town 0.14 5.0 4.9
1/28/17 Magic Mountain 1.50 1.1 0.4
1/29/17 Valleyville 0.36 13.3 7.3
1/29/17 Center Town 0.75 6.1 9.0
1/29/17 Magic Mountain 0.60 3.3 6.0

Transformation:

The following computes average temperature for rainy days by city:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 = =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula AVERAGEIF(temp, rain > 0) city 'avgTempWRain'
=20

The following computes maximum wind for sub-zero days by city:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula MAXIF(wind,temp < 0) city 'maxWindSubZero'
=20

This step calculates the minimum temp when the wind is less than 5 mph b= y city:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula MINIF(temp,wind<5) city 'minTempWind5'
=20

This step computes the variance in temperature for rainy days by city:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula VARIF(temp,rain >0) city 'varTempWRain'
=20

The following computes the standard deviation in rainfall for Center Tow= n:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name <= code>New formula Single row formula STDEVIF(rain,city=3D=3D'Center Town') city 'stDevRainCT'
=20

You can use the following transforms to format the generated output. Not= e the \$col placeholder value for the multi-column transforms:<= /p>

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name Edit column with formula stDevRainCenterTown,maxWindSubZero<= /td>=20 numformat(\$col,'##.##')
=20

Since the following rely on data that has only one significant digit, yo= u should format them differently:

=20
=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
Transformation Name Edit column with formula varTempWRain,avgTempWRain,minTempWind5 numformat(\$col,'##.#')
=20

Results:

=20 =20 =20 =20 =20 =20 =20 =20 =20 =20 =20
date city rain temp wind avgTempWRain maxWindSubZero minTempWind5 varTempWRain stDevRainCT
1/23/17 Valleyville 0.00 12.8 6.7 8.3 5.1 7.2 63.8 0.37
1/23/17 Center Town 0.31 9.4 5.3 7.3   5 32.6 0.37
1/23/17 Magic Mountain 0.00 0.0 7.3 1.6 6.43 -3.9 12 0.37
1/24/17 Valleyville 0.25 17.2 3.3 8.3 5.1 7.2 63.8 0.37
1/24/17 Center Town 0.54 1.1 7.6 7.3   5 32.6 0.37
1/24/17 Magic Mountain 0.32 5.0 8.8 1.6 6.43 -3.9 12 0.37
1/25/17 Valleyville 0.02 3.3 6.8 8.3 5.1 7.2 63.8 0.37
1/25/17 Center Town 0.83 3.3 5.1 7.3   5 32.6 0.37
1/25/17 Magic Mountain 0.59 -1.7 6.4 1.6 6.43 -3.9 12 0.37
1/26/17 Valleyville 1.08 15.0 4.2 8.3 5.1 7.2 63.8 0.37
1/26/17 Center Town 0.96 6.1 7.6 7.3   5 32.6 0.37
1/26/17 Magic Mountain 0.77 -3.9 3.0 1.6 6.43 -3.9 12 0.37
1/27/17 Valleyville 1.00 7.2 2.8 8.3 5.1 7.2 63.8 0.37
1/27/17 Center Town 1.32 20.0 0.2 7.3   5 32.6 0.37
1/27/17 Magic Mountain 0.77 5.6 5.2 1.6 6.43 -3.9 12 0.37
1/28/17 Valleyville 0.12 -6.1 5.1 8.3 5.1 7.2 63.8 0.37
1/28/17 Center Town 0.14 5.0 4.9 7.3   5 32.6 0.37
1/28/17 Magic Mountain 1.50 1.1 0.4 1.6 6.43 -3.9 12 0.37
1/29/17 Valleyville 0.36 13.3 7.3 8.3 5.1 7.2 63.8 0.37
1/29/17 Center Town 0.75 6.1 9.0 7.3   5 32.6 0.37
1/29/17 Magic Mountain 0.60 3.3 6.0 1.6 6.43 -3.9 12 0.37

=20
=20

=20

=20

=20
=20
=20

=20
=20