Trifacta Dataprep



At the flow level, you can define flow parameters to reference in your recipes. A flow parameter is a variable that is assigned a String value. 

NOTE: Flow parameters apply to recipe steps only.

  • You can apply override values to flow parameters and to parameters of other types at the flow level through the same interface. Details are below.
  • For more information on flow parameters, see Overview of Parameterization.

Limitations

  • Flow parameters are of String data type.

    Tip: You can wrap flow parameter references in your transformations with one of the PARSE functions. See "Examples" below.

  • Flow parameters are converted to constants in macros. Use of the macro in other recipes results in the constant value being applied.

Limitations on usage

A flow parameter cannot be used in the following transformation steps or fields.

Transformations:

  • Rename columns: Cannot use a flow parameter as a new column name.

Transformation fields:

  • The as clause when creating a New formula transformation. 

Create Parameter

Steps:

  1. Open the flow where you wish to apply the flow parameter. 
  2. From the Flow View context menu, select Parameters.
  3. In the Manage Parameters dialog, click the Parameters tab. 
  4. Click Add parameter.
  5. Enter a Name for your parameter.

    NOTE: Name values are case-sensitive. After saving a flow parameter, its name cannot be changed.

  6. Enter a default value for this parameter.

    NOTE: Input Values are evaluated as String type.

  7. Click Save.

The parameter is available for use in any recipe in your flow. See "Use Parameter."

Parameter Names

Parameter names can contain alphanumeric characters and spaces. The following table shows how parameter names must be referenced in recipe steps.

Parameter name   Valid references               Notes
paramRegion      $paramRegion, ${paramRegion}   Both references are valid.
param Region     ${param Region}                Curly brackets are required.

NOTE: If the parameter name contains a space, the curly brackets are required. As a matter of habit, you might want to use the curly brackets for all parameter references. This syntax also helps to distinguish your named parameters from metadata references, which are fixed. See Source Metadata References.
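
The reference rules above can be illustrated with a small Python sketch. This is not how the application resolves references internally; the substitute function and the sample values are hypothetical and exist only to show why a name containing a space needs curly brackets.

```python
import re

# Matches the ${name} form (spaces allowed) or the bare $name form (no spaces).
_REFERENCE = re.compile(r"\$\{([A-Za-z0-9 ]+)\}|\$([A-Za-z0-9]+)")


def substitute(text, params):
    """Replace parameter references in text with values from params."""

    def lookup(match):
        name = match.group(1) or match.group(2)
        # Names are case-sensitive; unknown references are left untouched.
        return params.get(name, match.group(0))

    return _REFERENCE.sub(lookup, text)


# Hypothetical parameter values for illustration only.
params = {"paramRegion": "West", "param Region": "East"}
print(substitute("$paramRegion / ${paramRegion} / ${param Region}", params))
# prints "West / West / East"
```

In this sketch, the bare form stops at the first space, so $param Region is read as a reference to a parameter named param, which does not exist. Using curly brackets everywhere avoids that ambiguity.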

Apply Parameter Override

NOTE: Parameter overrides that were defined in a pre-Release 7.1 version of the software now appear in the Overrides tab.

You can apply overrides to all parameter types, including flow parameters, at the flow level. An overridden value applies to all references of the parameter within the flow.

NOTE: You can apply override values for any parameter of any type that is referenced in the flow: dataset parameters, flow parameters, and object parameters.

  • Upstream parameter values: Parameter values can be inherited from upstream recipes and datasets. 

    NOTE: Override values applied in a downstream flow are applied to the upstream flow when its objects are invoked for purposes of generating data for use in the downstream flow.

  •  Downstream parameter values: Downstream flows receive parameter values, default or overridden, from upstream flows. These values can be overridden at the flow level.

Steps:

  1. Open the flow where you wish to apply the parameter override. 
  2. From the Flow View context menu, select Manage parameters....
  3. In the Manage Parameters dialog, click the Overrides tab. 
  4. Click Add override.
  5. Select the parameter to override from the drop-down list.
  6. Set the override value for this flow.
  7. Click Save.

This override value is applied to all references to the parameter in the flow. 

Tip: Overrides can also be applied to the recipe parameters that are included when flow tasks are executed as part of a plan. For more information, see Manage Parameters Dialog.

Override Evaluation

Override values can be applied in multiple locations. Parameter values are evaluated in the following order of precedence (highest to lowest):

  1. Overrides at run-time in the Run Job page.
  2. Overrides at the flow level.
  3. Default values for the flow. 
  4. Inherited values from upstream flows.

For more information, see Overview of Parameterization.
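
The order of precedence above can be sketched as a lookup through layers, highest precedence first. This is a minimal illustration in Python, not the application's implementation; the resolve function and its argument names are hypothetical.

```python
def resolve(name, run_time=None, flow_overrides=None,
            flow_defaults=None, inherited=None):
    """Return the first value found for name, checking highest precedence first."""
    # Order mirrors the list above: run-time overrides, flow-level overrides,
    # flow default values, then values inherited from upstream flows.
    layers = [run_time, flow_overrides, flow_defaults, inherited]
    for layer in layers:
        if layer and name in layer:
            return layer[name]
    raise KeyError(name)


# Hypothetical values for illustration only.
value = resolve(
    "paramTimeZone",
    flow_overrides={"paramTimeZone": "Central"},
    flow_defaults={"paramTimeZone": "##UNSPECIFIED##"},
)
print(value)  # prints "Central"
```

Because no run-time override is supplied in this call, the flow-level override wins over the default value.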

Use Parameter

In your recipe step, you can add references to your flow parameter in the following format:

${MyRecipeParameter}

In a recipe, flow parameters can be applied to:

  • Function parameters
  • Replacements for String values

Examples

Below are examples of how to use flow parameters.

NOTE: When a parameter value is displayed in a column, the data grid may correctly infer your desired data type for the column. However, the underlying type is still String. To convert the underlying type, you must use one of the PARSE functions on your String values.

Example - String parameter

In this example, data is segmented by time zone. You must create a parameter to capture the following U.S. time zones, which must be specified explicitly:

'Hawaii'
'Alaska'
'Pacific'
'Mountain'
'Central'
'Eastern'

In your flow, you create the following flow parameter:

Setting   Value             Notes
Name      paramTimeZone     Tip: It's a good habit to specify named variables in an identifiable way. By adding the param prefix, you identify references to it as a parameter. If you change the name to param-recipeTimeZone or similar to distinguish it as a flow parameter, then overrides specified at the flow level do not apply to any other parameter types that are performing the same function in the data.
Value     ##UNSPECIFIED##   Because the time zone must be specified explicitly, you set this placeholder as the default value. If this value appears in the generated output, then the flow parameter was not specified when the job was run.

NOTE: Before you begin working with this parameter in your dataset, you should consider setting an override for it to a valid value.

In the following transformation, the parameter value is inserted into a new column, paramTZ, in your dataset:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula ${paramTimeZone}
Parameter: New column name 'paramTZ'

You can also use the parameter as an input to a function. In the following example, the paramTimeZone parameter is merged with the values in the Store_Nbr column to compute the primary key field, storeId:

NOTE: You cannot use the Merge columns transformation for this step, since it requires named columns as inputs.

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula merge([$paramTimeZone,Store_Nbr], '-')
Parameter: New column name 'storeId'

Example - parameter with multiple values

Suppose you wish to create a flow parameter that contains multiple values. Typically, you must track these values through an array, such as the following containing a set of colors:

["red","white","blue","black"]

Flow parameters that are literals are String values only. As a workaround, you can define the above as a Trifacta pattern:


Setting   Value                    Notes
Name      myColors
Value     `red|white|blue|black`   Note how the value is specified using backticks (`), which are used to indicate a Trifacta pattern. The vertical bars are delimiters that separate the values when they are processed within the application.

Within your recipe, you can test for the presence of a parameter value. In the following transformation, a value of true is set in the new column isBlue if the value of $myColors is blue:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula MATCHES([blue], $myColors, true)
Parameter: New column name 'isBlue'
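
To see why a set of alternatives separated by vertical bars can stand in for an array, consider the following rough analog using Python regular expressions. Python regex syntax is not Trifacta pattern syntax, and the is_listed helper is hypothetical; the sketch only illustrates the alternation idea.

```python
import re

# The pattern body from the example, without the surrounding backticks.
my_colors = "red|white|blue|black"
pattern = re.compile(my_colors)


def is_listed(value):
    """Return True if value is exactly one of the alternatives in the pattern."""
    return pattern.fullmatch(value) is not None


print(is_listed("blue"))   # prints True
print(is_listed("green"))  # prints False
```

This sketch uses a full match so that a longer value such as blueberry does not count as a hit. The MATCHES function in the application may treat partial matches differently, so check its documentation before relying on this behavior.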

Example - Integer parameter

Instead of segmenting the data by named time zone values, suppose your data is segmented by regions, which are identified by numeric values. Your flow parameter definition could look like the following:

Setting   Value           Notes
Name      paramRegionId   Note the more appropriate name.
Value     0               In this case, there is no region identifier value 0. You choose to set the default to a value that is valid for the target data type (Integer) but is invalid for the scope of the data itself.

To use this flow parameter as an integer, you must reference it wrapped in the PARSEINT function, which evaluates the input value against the Integer data type:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula PARSEINT(${paramRegionId})
Parameter: New column name paramRegionId

In the column histogram for the paramRegionId column, you can verify that the value 0 is present. Set an override at the flow level to insert a different value in the column. 

For more information, see PARSEINT Function.
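
The idea of a default that is valid for the data type but invalid for the data can be sketched in Python as follows. The parse_int and region_was_overridden helpers are hypothetical, and returning None for unparseable input is a choice made for this sketch, not a statement about how PARSEINT handles invalid values.

```python
DEFAULT_REGION_ID = 0  # sentinel from the example: a valid Integer, not a real region


def parse_int(text):
    """Return text as an int, or None if it is not a valid integer string."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def region_was_overridden(raw_value):
    """Return True if the String value parses to something other than the sentinel."""
    return parse_int(raw_value) not in (None, DEFAULT_REGION_ID)


print(region_was_overridden("0"))   # prints False
print(region_was_overridden("12"))  # prints True
```

Checking for the sentinel after a job run is a quick way to confirm that an override actually reached the recipe.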

Example - Date parameter

Suppose you need to be able to pass a date into the execution of a recipe. If no date is passed in, then the current time is used. The variable is declared as follows:

Setting   Value       Notes
Name      paramDate   Note the more appropriate name.
Value     (empty)     In this case, the value is left empty to be overridden as needed in the application with the current timestamp.

You should decide on the expected values for this parameter, as you must apply them to:

  • Parameter overrides
  • Recipe steps (e.g., PARSEDATE function parameters)

It may be easier to insert the format string here as the default value. For example:

yyyy-MM-dd HH:mm:ss

You can use the following to insert the parameter value into your dataset. Note that the value is initially inserted as a String value, so the PARSEDATE function is used as a wrapper:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula PARSEDATE(${paramDate},['yyyy-MM-dd HH:mm:ss'])
Parameter: New column name execDate

For more information, see PARSEDATE Function.

If the inserted value is empty or null, you can insert the current timestamp:

Tip: You could also overwrite invalid values in the following manner. However, that may mask problems with your inserted values.

Transformation Name Edit column with formula
Parameter: Columns execDate
Parameter: Formula IF((execDate == '') || ISNULL(execDate), NOW('UTC'), execDate)

In the above, the value in execDate is tested to see if it is either:

  • empty
  • null

If so, the output of the NOW function is written. By default, this function returns the timestamp value at UTC time. 

If there is a valid value, then it is written back to the column.

See NOW Function.
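
The fallback logic above can be sketched in Python as follows. The exec_date function is hypothetical, the format constant is the Python strptime spelling of a year-month-day hour:minute:second layout, and treating the parsed value as UTC is an assumption made for this sketch.

```python
from datetime import datetime, timezone

# Python strptime layout: year-month-day hour:minute:second.
FORMAT = "%Y-%m-%d %H:%M:%S"


def exec_date(raw_value, now=None):
    """Parse raw_value; fall back to the current UTC time when it is empty or None."""
    if raw_value is None or raw_value.strip() == "":
        # The optional now argument exists only to make the sketch testable.
        return now or datetime.now(timezone.utc)
    # Assumption: supplied values are interpreted as UTC.
    return datetime.strptime(raw_value, FORMAT).replace(tzinfo=timezone.utc)


print(exec_date("2024-01-15 08:30:00").isoformat())
# prints "2024-01-15T08:30:00+00:00"
```

Only empty and missing values fall back to the current time here. A malformed value raises ValueError instead of being silently replaced, which keeps problems with the inserted values visible.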

You can use the following to extract the time value from the parsed date parameter:

Transformation Name New formula
Parameter: Formula type Single row formula
Parameter: Formula DATEFORMAT(execDate, 'HH:mm:ss')
Parameter: New column name Time

Since this value is not the parameter value specifically, the column name was listed simply as Time.

Apply Parameter Override via API

When you run a job via the APIs, you can apply parameter overrides to the following parameter types:

  • dataset parameters
  • output parameters
  • flow parameters

For more information, see API Workflow - Run Job.
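
As a rough sketch, a request body that carries parameter overrides might be assembled as shown below. The field names (wrangledDataset, runParameters, overrides, data, key, value) and the recipe identifier 7 are assumptions for illustration and are not confirmed on this page; verify the exact request shape in API Workflow - Run Job before using it. The build_run_job_body helper is hypothetical and sends nothing over the network.

```python
import json


def build_run_job_body(recipe_id, overrides):
    """Build a request body that runs a recipe with parameter overrides.

    The field names used here are assumptions; check the API documentation.
    """
    return {
        "wrangledDataset": {"id": recipe_id},
        "runParameters": {
            "overrides": {
                # Flow parameter values are Strings, so each value is converted.
                "data": [
                    {"key": key, "value": str(value)}
                    for key, value in overrides.items()
                ]
            }
        },
    }


# Hypothetical recipe identifier and override value.
body = build_run_job_body(7, {"paramTimeZone": "Pacific"})
print(json.dumps(body, indent=2))
```

Keeping the overrides in a plain dictionary makes it straightforward to reuse the same set of values for several job runs.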
