Page tree

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

Version 1 Next »


Contents:

NOTE:  Designer Cloud Educational is a free product with limitations on its features. Some features in the documentation do not apply to this product edition. See Product Limitations.

   

Computes the correlation coefficient between two columns. Source values can be of Integer or Decimal type.

The correlation coefficient measures the relationship between two sets of values. You can use it as a measurement for how changes in one value affect changes in the other. 

  • Values range between -1 (negative correlation) and +1 (positive correlation). 
    • Negative correlation means that the second number tends to decrease when the first number increases. 
    • Positive correlation means that the second number tends to increase when the first number increases.
    • A correlation coefficient that is close to 0 indicates a weak or non-existent correlation.

Relevant terms:

Term Description
Population Population statistical functions are computed from all possible values. See https://en.wikipedia.org/wiki/Statistical_population.
Sample

Sample-based statistical functions are computed from a subset or sample of all values. See https://en.wikipedia.org/wiki/Sampling_(statistics).

These function names include SAMP in their name.

NOTE: Statistical sampling has no relationship to the samples taken within the product. When statistical functions are computed during job execution, they are applied across the entire dataset. Sample method calculations are computed at that time.

Basic Usage

correl(initialInvestment,ROI)

Output: Returns the correlation coefficient between the values in the initialInvestment column and the ROI column.

Syntax and Arguments

correl(function_col_ref1,<span>function_col_ref2</span>) [group:group_col_ref] [limit:limit_count]


ArgumentRequired?Data TypeDescription
function_col_ref1YstringName of column that is the first input to the function
function_col_ref2YstringName of column that is the second input to the function

For more information on the group and limit parameters, see Pivot Transform.

For more information on syntax standards, see Language Documentation Syntax Notes.

function_col_ref1, function_col_ref2

Name of the column the values of which you want to calculate the correlation. Column must contain Integer or Decimal values.

  • Literal values are not supported as inputs.
  • Multiple columns and wildcards are not supported.

Usage Notes:

Required?Data TypeExample Value
YesString (column reference)myInputs


Examples


Tip: For additional examples, see Common Tasks.

This example illustrates statistical functions that can be applied across two columns of values.

Functions:

ItemDescription
CORREL Function Computes the correlation coefficient between two columns. Source values can be of Integer or Decimal type.
COVAR Function Computes the covariance between two columns using the population method. Source values can be of Integer or Decimal type.
COVARSAMP Function Computes the covariance between two columns using the sample method. Source values can be of Integer or Decimal type.
ROUND Function Rounds input value to the nearest integer. Input can be an Integer, a Decimal, a column reference, or an expression. Optional second argument can be used to specify the number of digits to which to round.

Source:

The following table contains height in inches and weight in pounds for a set of students.

StudentheightInweightLbs
170134
267135
367147
467160
572136
673146
771135
863145
967138
1066138
1171161
1270131
1374131
1467157
1573161
1670133
1763132
1864153
1964156
2072154


Transformation:

You can use the following transformations to calculate the correlation co-efficient, the covariance, and the sampling method covariance between the two data columns:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula round(correl(heightIn, weightLbs), 3)
Parameter: New column name 'corrHeightAndWeight'

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula round(covar(heightIn, weightLbs), 3)
Parameter: New column name 'covarHeightAndWeight'

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula round(covarsamp(heightIn, weightLbs), 3)
Parameter: New column name 'covarHeightAndWeight-Sample'

Results:

StudentheightInweightLbscovarHeightAndWeight-SamplecovarHeightAndWeightcorrHeightAndWeight
170134-2.876-2.732-0.074
267135-2.876-2.732-0.074
367147-2.876-2.732-0.074
467160-2.876-2.732-0.074
572136-2.876-2.732-0.074
673146-2.876-2.732-0.074
771135-2.876-2.732-0.074
863145-2.876-2.732-0.074
967138-2.876-2.732-0.074
1066138-2.876-2.732-0.074
1171161-2.876-2.732-0.074
1270131-2.876-2.732-0.074
1374131-2.876-2.732-0.074
1467157-2.876-2.732-0.074
1573161-2.876-2.732-0.074
1670133-2.876-2.732-0.074
1763132-2.876-2.732-0.074
1864153-2.876-2.732-0.074
1964156-2.876-2.732-0.074
2072154-2.876-2.732-0.074

See Also for EXAMPLE - Two-Column Statistical Functions:

See Also for CORREL Function:

Error rendering macro 'contentbylabel'

parameters should not be empty

  • No labels

This page has no comments.