
NOTE: This feature is available for Dataprep by Trifacta editions only.

Feature Availability: This feature may not be available in all product editions. For more information on available features, see Compare Editions.

This section describes how to create and deploy JavaScript-based user-defined functions (UDFs) for use in your recipes in your Dataprep by Trifacta project.

Wrangle provides a wide range of functions to transform your data across all supported data types. However, you may need to define your own functions to meet specific use cases.

Examples:

  • Your enterprise has a specific method for calculating asset depreciation, which must be applied consistently.
  • You use specialized statistical calculations for managing risk. 
  • Your industry has commonly used metrics that are not available in the Dataprep by Trifacta application.

To support more specific use cases, you can create functions in JavaScript and make them available for use within the recipes that you create in the Dataprep by Trifacta application.

Features:

  • Easy to create and modify directly through the Dataprep by Trifacta application.
  • JavaScript UDFs are available through the Search panel and can be referenced in formulas like any other Wrangle function.
  • Executable for both Trifacta Photon and Dataflow jobs.
  • Internationalization is supported through the built-in JavaScript Intl object. For more information, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl.

Prerequisites

Creation of UDFs requires JavaScript development experience.

Required access levels

Tip: By default, UDFs are shared with all users in the project. Users that do not have access privileges on UDFs can still use the functions in Wrangle.

The access level for the JavaScript UDF feature and for UDF code is determined by the User defined functions privilege.

Tip: To create or edit JavaScript UDFs, you must have the Author level of the User defined functions privilege in at least one of your account roles. By default, the Default role is assigned the Author level for this privilege.

For more information, see Overview of Authorization.

For more information, see Privileges and Roles Reference.

Enable

This feature is enabled by default. See Dataprep Project Settings Page.

Limitations

NOTE:  If execution of the UDF fails on one or more rows, the values in the affected rows are empty, and the job is permitted to continue execution.

The following limitations apply to JavaScript UDFs:

  • JavaScript UDFs execute function calls for each row of data. 
    • UDFs cannot guarantee maintaining global variable state across an entire sorted dataset that is being processed.
    • Aggregate functions across multiple rows are not supported.
  • Each UDF can add a new column or edit an existing column's values. Other types of data transformations are not supported.
  • Functions cannot be implemented to make external calls over the network. External API calls are not supported.
    • JavaScript libraries cannot be imported for use in creating your UDFs.
  • UDFs cannot be used in macros.
  • UDFs cannot be defined to replace functions in Wrangle.
  • You cannot reference Wrangle functions from within your UDF code.

    Tip: Wrangle references can be used as inputs to a UDF or in a Formula of which your UDF is a part. Details are below.

  • Export and import of JavaScript UDFs is not supported. 

    Tip: JavaScript UDFs can be imported into another project or workspace when they are referenced in a flow that you are importing. See "Importing through Flows" below.

  • If assets are transferred to a new user via API, UDFs are not included in the transfer.
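The per-row execution model described in the limitations above can be illustrated with a minimal sketch. The example below is hypothetical: the aliases (cost, salvage, life) and the straight-line depreciation formula are assumptions chosen for illustration, not part of the product. The point is that the function depends only on the values passed in for the current row and keeps no state between calls.

```javascript
// Hypothetical example: straight-line depreciation for a single row.
// The aliases and the formula are illustrative assumptions.
export const signature = {
  "inputs": [
    {
      "alias": "cost",
      "description": "Purchase cost of the asset",
      "displayType": "Float",
      "examples": ["COL1"]
    },
    {
      "alias": "salvage",
      "description": "Expected salvage value",
      "displayType": "Float",
      "examples": ["COL2"]
    },
    {
      "alias": "life",
      "description": "Useful life in periods",
      "displayType": "Integer",
      "examples": ["COL3"]
    }
  ],
  "output": {
    "type": "Float"
  }
};

export function trifactaUdf(cost, salvage, life) {
  // Uses only the values passed in for the current row. No module-level
  // counters or running totals are kept, since state across rows is not
  // guaranteed to be preserved.
  return (cost - salvage) / life;
}
```

A calculation that needs values from several rows, such as a running total, does not fit this model and would have to be handled elsewhere in the recipe.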

Supported JavaScript

JavaScript implementation

  • Only JavaScript ES6 is supported. For more information, see https://www.w3schools.com/js/js_es6.asp.
  • JavaScript V8 libraries are used internally. For more information on licensing, see https://github.com/v8/v8/blob/master/LICENSE.
  • Memory and stack depth limits may impact the number of nested loops or recursive calls that can be executed within a single function.
  • HTTP request objects cannot be referenced from within your JavaScript.
  • JavaScript definitions are not stored in encrypted format.
  • The export keyword can be used for the following definitions only:
    • The UDF signature constant:

      export const signature


    • The main function:

      export function trifactaUdf()

      NOTE: Use of the export keyword in other areas of your code causes failures if the UDF is executed as part of a BigQuery pushdown transformation. Other uses of export are not supported.
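As a sketch of the export rule, the following hypothetical UDF uses a helper function that is deliberately left unexported. The alias text and the helper name collapseWhitespace are assumptions made for illustration; only the signature constant and trifactaUdf carry the export keyword.

```javascript
// Hypothetical example: only the signature and the main function are exported.
export const signature = {
  "inputs": [
    {
      "alias": "text",
      "description": "Text to normalize",
      "displayType": "String",
      "examples": ["COL1"]
    }
  ],
  "output": {
    "type": "String"
  }
};

// Helper function: intentionally NOT exported.
function collapseWhitespace(value) {
  // Replace each run of whitespace with one space, then trim both ends.
  return value.replace(/\s+/g, " ").trim();
}

export function trifactaUdf(text) {
  return collapseWhitespace(text);
}
```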


Supported JavaScript objects

Fundamental objects:

  • Object

  • Function

  • Boolean
  • Symbol

Error handling:

  • Error
  • AggregateError
  • EvalError
  • InternalError
  • RangeError
  • ReferenceError
  • SyntaxError
  • TypeError
  • URIError

Data types:

  • Number
  • BigInt
  • Math
  • Date
  • String
  • RegExp
  • Array

Array types:

  • Int8Array
  • Uint8Array
  • Uint8ClampedArray
  • Int16Array
  • Uint16Array
  • Int32Array
  • Uint32Array
  • Float32Array
  • Float64Array
  • BigInt64Array
  • BigUint64Array

Other supported objects and collections:

  • Map

  • Set
  • WeakMap
  • WeakSet
  • JSON
  • ArrayBuffer
  • DataView
  • Promise
  • Generator
  • GeneratorFunction

Supported JavaScript value properties

  • undefined

  • NaN

  • Infinity
  • isFinite
  • isNaN

Supported JavaScript functions

Parsing functions:

  • parseFloat
  • parseInt

URI encoding functions:

  • encodeURI
  • encodeURIComponent
  • decodeURI
  • decodeURIComponent

Internationalization functions:

  • Intl
  • Intl.Collator
  • Intl.DateTimeFormat
  • Intl.ListFormat
  • Intl.NumberFormat
  • Intl.PluralRules
  • Intl.RelativeTimeFormat
  • Intl.Locale
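A brief sketch of how one of these internationalization objects might be used inside a UDF. The alias amount, the locale en-US, and the currency USD are assumptions for illustration; the Intl.NumberFormat call itself is standard JavaScript.

```javascript
// Hypothetical example: format a numeric value as US currency text.
export const signature = {
  "inputs": [
    {
      "alias": "amount",
      "description": "Amount to format",
      "displayType": "Float",
      "examples": ["COL1"]
    }
  ],
  "output": {
    "type": "String"
  }
};

export function trifactaUdf(amount) {
  // The formatter is created inside the function so that the example does
  // not rely on any state kept between rows.
  const formatter = new Intl.NumberFormat("en-US", {
    style: "currency",
    currency: "USD"
  });
  return formatter.format(amount);
}
```

In a standard V8 environment, an input of 1234.5 yields the string $1,234.50.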

Type Conversions

The following tables indicate how Dataprep by Trifacta data types are converted to JavaScript data types during input and also for output back to Dataprep by Trifacta data types.

For more information on the string values for Dataprep by Trifacta data types, see Valid Data Type Strings.

Input type conversions

Dataprep by Trifacta Data Type | JavaScript UDF Data Type | Notes
String | JS String |
Integer | JS Numbers | JavaScript has no data type for integers. All numeric values are interpreted as JS Numbers.
Decimal | JS Numbers |
Boolean | 1/0 |
Social Security Number | JS String |
Phone Number | JS String |
Email Address | JS String |
Credit Card | JS String |
Gender | JS String |
Object | JS Objects |
Array | JS Arrays |
IP Address | JS String |
URL | JS String |
HTTP Code | JS String |
Zip Code | JS String |
State | JS String |
Datetime | JS String |
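The following sketch shows how a UDF might handle two of the less obvious input conversions: Boolean values arriving as 1/0 and Datetime values arriving as strings. The aliases isActive and createdAt are hypothetical, and the example assumes that the Datetime string is in ISO 8601 format; other formats would need their own parsing.

```javascript
// Hypothetical example: handle Boolean (1/0) and Datetime (String) inputs.
export const signature = {
  "inputs": [
    {
      "alias": "isActive",
      "description": "Whether the account is active",
      "displayType": "Boolean",
      "examples": ["COL1"]
    },
    {
      "alias": "createdAt",
      "description": "Creation timestamp",
      "displayType": "Datetime",
      "examples": ["COL2"]
    }
  ],
  "output": {
    "type": "String"
  }
};

export function trifactaUdf(isActive, createdAt) {
  // Boolean inputs arrive as 1/0, not true/false. Number() is used so that
  // both a numeric 1 and a text "1" are treated the same way.
  const active = Number(isActive) === 1;
  // Datetime inputs arrive as strings. Assumption: ISO 8601 text.
  const year = new Date(createdAt).getUTCFullYear();
  return (active ? "active" : "inactive") + " since " + year;
}
```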

Output data types

On output, JavaScript data types are exported to their corresponding Dataprep by Trifacta data types, with the following specific mappings: 

Tip: The fallback data type is String. Exported String values may require conversion to other Dataprep by Trifacta data types.

JavaScript UDF Data Type | Dataprep by Trifacta Data Type | Notes
JS String | String |
JS Numbers | Integer or Decimal | JavaScript returns all numeric values as JS Numbers. The Dataprep by Trifacta application infers whether each value is an Integer or a Decimal.
1/0 | Boolean |
JS Objects | Object |
JS Arrays | Array |
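As an illustration of a non-String output, this sketch returns a JS Array. The alias tags is hypothetical, and the output type string "Array" is assumed from the mapping above; confirm the exact value in Valid Data Type Strings before relying on it.

```javascript
// Hypothetical example: split comma-separated text into an array of tags.
export const signature = {
  "inputs": [
    {
      "alias": "tags",
      "description": "Comma-separated list of tags",
      "displayType": "String",
      "examples": ["COL1"]
    }
  ],
  "output": {
    // Assumption: "Array" is the valid type string for array output.
    "type": "Array"
  }
};

export function trifactaUdf(tags) {
  return tags
    .split(",")                       // break the text at each comma
    .map((tag) => tag.trim())         // remove surrounding whitespace
    .filter((tag) => tag.length > 0); // drop empty entries
}
```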

Security Measures

To minimize the risks of executing user-supplied code on the platform, the following security measures have been applied:

  • Some language constructs like eval are not supported.
  • During design time, web workers are used.

  • During runtime (job execution): 
    • V8 isolates are used for execution. For more information, see https://v8.dev/docs/untrusted-code-mitigations.

    • Jobs are executed in sandboxed compute resources for the following running environments:
      • Trifacta Photon: remote Kubernetes job
      • Dataflow: customer VPC
      • Users that are not executing the job are isolated from job data and Dataprep by Trifacta metadata through UDFs.
  • JavaScript UDFs have execution time limits placed on them to prevent runaway processes. 
  • JavaScript UDFs have other security related restrictions to prevent malicious code injection.
  • Sensitive APIs of the Dataprep by Trifacta Cloud are not exposed in the global object.

Enable

For more information on enabling this feature in your project, please contact Google Support.

UDF Requirements

The following sections define the requirements for creating your UDF.

Metadata

The following metadata objects must be defined as part of your JavaScript UDF:

  • Name: Name of your JavaScript function.
    • Only alphanumeric characters and underscores (_) are supported. No special characters.
    • Function names must be unique within the project or workspace.

    Tip: All UDF function names are prepended with UDF., which makes them easier to discover in the Transform Builder. Do not prepend this value or similar here.

  • Description: User-friendly description of the function. This description appears in the Transform Builder and should be kept to a single line of text.
  • Signature: The function's signature defines the inputs that it accepts. See below for more information.
  • Function definition: This section defines the custom JavaScript code that is executed when the function is invoked.

Signature

Changing the signature of a JavaScript UDF that is already in use may break recipe steps and may cause jobs to fail. For example, if you add required inputs to the function, recipe steps become broken. When you make changes to the function's signature, you should review the recipes where the function is in use.

The signature defines the inputs of the function. A JavaScript UDF signature has the following generic structure. 

Tip: The function signature and code must be supplied in the same entry in the Dataprep by Trifacta application. They are separated here for clarification purposes.

NOTE: Avoid exporting any JavaScript objects other than the signature and function definition.

export const signature = {
  "inputs": [   
   {
      "alias": "Input1",
      "description": "Input1's description",
      "displayType": "<DataType>",
      "examples": [
        "COL1"
      ]
    },
    {
      "alias": "Input2",
      "description": "Input2's description",
      "displayType": "<DataType>",
      "examples": [
        "COL2"
      ]
    }
  ],
  "output": {
    "type" : "String"
  }
};

Notes:

  • The signature and function definition must be marked for export in the JavaScript module. The following line must be present at the start of the signature:

    export const signature = {

    NOTE: Use of the export keyword outside of the UDF signature constant or main function declaration causes failures if the UDF is executed as part of a BigQuery pushdown transformation. Other uses of export are not supported.

  • Individual inputs are specified as elements of an array. Each array element can contain the following attributes and values:
    • alias: (Required) Unique name of the input. This value can be referenced within the function.
    • description: User-friendly description of the input. Appears in the popup that is displayed when the function has been selected.
    • displayType: Text value representing the data type of the input. This value is for display purposes only. Appears in the function popup.
    • examples: Text value or values to indicate the type of input. Appears in the function popup.
  • The number of inputs in the signature array must equal the number of inputs in the main function.
  • The output of the function is specified according to the Dataprep by Trifacta data type. For more information on the values to insert for type, see Valid Data Type Strings.

Function definition

The definition of your function has the following basic structure:

Tip: The function signature and code must be supplied in the same entry in the Dataprep by Trifacta application. They are separated here for clarification purposes.

NOTE: Avoid exporting any JavaScript objects other than the signature and function definition.

export function trifactaUdf(<alias1>, <alias2>) {
   <your_JavaScript>
   return <values_returned>;
}

Notes:

  • The signature and function definition must be marked for export in the JavaScript module. The following line must be present at the start of the function definition:

    export function trifactaUdf(<any_inputs>) {

    • In the above, the name of each input to the function must match the corresponding alias value in the signature.

  • The values returned from the function are specified as part of the return entry.
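Putting the signature and function definition rules together, a minimal complete UDF might look like the following sketch. The aliases firstName and lastName are hypothetical. Note that the parameter names of trifactaUdf match the alias values, and that the number of parameters equals the number of entries in the inputs array.

```javascript
// Hypothetical example: combine two text inputs into "Last, First".
export const signature = {
  "inputs": [
    {
      "alias": "firstName",
      "description": "Given name",
      "displayType": "String",
      "examples": ["COL1"]
    },
    {
      "alias": "lastName",
      "description": "Family name",
      "displayType": "String",
      "examples": ["COL2"]
    }
  ],
  "output": {
    "type": "String"
  }
};

// Parameter names match the aliases above, in the same order.
export function trifactaUdf(firstName, lastName) {
  return lastName + ", " + firstName;
}
```

When testing outside the application, comparing signature.inputs.length with trifactaUdf.length is a quick way to confirm that the two input counts agree.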

Type inference

JavaScript is a dynamically typed language. As a result, actual data types are assigned to inputs based on type inferencing performed by the Dataprep by Trifacta application during design time.

Flow parameters are always passed into the function for execution as String values and must be explicitly cast to other data types within the function.

Tip: When using a function that returns String values, you can wrap the function in a parsing function like PARSEINT or PARSEDATE to convert to other data types.
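The casting requirement for flow parameters can be sketched as follows. The aliases amount and rate are hypothetical, and the example assumes that rate is supplied from a flow parameter and therefore arrives as text such as "0.25".

```javascript
// Hypothetical example: apply a rate that arrives as a String value.
export const signature = {
  "inputs": [
    {
      "alias": "amount",
      "description": "Base amount",
      "displayType": "Float",
      "examples": ["COL1"]
    },
    {
      "alias": "rate",
      "description": "Rate supplied as text, for example from a flow parameter",
      "displayType": "String",
      "examples": ["0.25"]
    }
  ],
  "output": {
    "type": "Float"
  }
};

export function trifactaUdf(amount, rate) {
  // Cast explicitly. Without the cast, 1 + rate would concatenate text
  // ("10.25" for a rate of "0.25") instead of adding numbers.
  const numericRate = parseFloat(rate);
  return amount * (1 + numericRate);
}
```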

Other requirements

Name scoping

Internally, all UDFs are scoped in name using the convention:

UDF.<yourFunctionName>

This naming convention prevents name collision with existing Wrangle functions.

Syntax

Use JavaScript ES6 syntax. For more information, see https://www.w3schools.com/js/js_es6.asp.

Commenting

You can insert comments in your UDF code. The following is an example:

// This is a comment.

Examples

Example - Net Present Value

This example computes net present value.

export const signature = {
  "inputs": [   
   {
      "alias": "R",
      "description": "Cash flow at time t",
      "displayType": "Integer",
      "examples": [
        "COL1"
      ]
    },
   {
      "alias": "i",
      "description": "Discount rate as a decimal value",
      "displayType": "Float",
      "examples": [
        "COL2"
      ]
    },
    {
      "alias": "t",
      "description": "Number of time units",
      "displayType": "Float",
      "examples": [
        "COL3"
      ]
    }
  ],
  "output": {
    "type" : "Float"
  }
};


export function trifactaUdf(R, i, t) {
   let discount = Math.pow(1 + i, t);  
   return (R / discount);
}
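To sanity-check the calculation before uploading it, the function can be run as a plain script. In the sketch below, the function is copied without the export keyword so that it runs outside the application; the sample values are arbitrary.

```javascript
// Copy of the example function, with the export keyword dropped so that the
// snippet runs as a plain script outside the application.
function trifactaUdf(R, i, t) {
  let discount = Math.pow(1 + i, t);
  return (R / discount);
}

// A cash flow of 1000 received 2 periods from now, discounted at 5%.
const npv = trifactaUdf(1000, 0.05, 2);
console.log(npv.toFixed(2)); // prints 907.03

// With t = 0, no discounting is applied and the cash flow is unchanged.
console.log(trifactaUdf(500, 0.1, 0)); // prints 500
```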

Other examples

For additional examples, please see https://community.trifacta.com/s/article/JavaScript-UDFs-Examples.

Deploy Your UDF

User Defined Functions page

When you have created your UDF in JavaScript, you can upload it to your project in the User Defined Functions page. Select Library > User defined functions. 

See User Defined Functions Page.

Importing through flows

JavaScript UDFs cannot be separately exported and imported from the Dataprep by Trifacta application. However, when you export a flow definition that contains references to a JavaScript UDF, the function definition is also exported.

When this flow is imported into a new project or workspace:

  • If no UDF with the same name exists in the target system, a new UDF is created.
  • If a UDF with the same name already exists, you can choose one of the following:
    • Use the existing UDF in the new project or workspace.
    • Overwrite the existing UDF with the UDF in the flow import.

      NOTE: You must have Editor or better privileges on the existing UDF to overwrite it during import.

Job execution

JavaScript UDFs are executed as part of jobs on Dataflow and Trifacta Photon.

NOTE: User-defined functions can be pushed down to BigQuery during job execution. This optimization must be enabled for each flow. For more information, see Flow Optimization Settings Dialog.

Error handling

  • When errors are encountered using JavaScript UDFs:
    • Transformer Page: errors in the UDF code are reported on the page. The code must be fixed to eliminate these errors for the sampled rows.
    • Job execution: null values are written as the output.
  • All errors encountered in executing a UDF in the JavaScript V8 engine are collected and written to the Dataprep by Trifacta job logs.
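Because a runtime error leaves the affected row without a value, it can help to guard against inputs that would otherwise throw. The following is a sketch only: the alias email is hypothetical, and returning an empty string for unusable input is a design assumption, not documented product behavior.

```javascript
// Hypothetical example: extract the domain from an email address without
// throwing on missing or malformed input.
export const signature = {
  "inputs": [
    {
      "alias": "email",
      "description": "Email address",
      "displayType": "String",
      "examples": ["COL1"]
    }
  ],
  "output": {
    "type": "String"
  }
};

export function trifactaUdf(email) {
  // Calling string methods on a missing value would throw, so check first.
  if (typeof email !== "string") {
    return ""; // assumption: empty text is an acceptable fallback
  }
  const at = email.lastIndexOf("@");
  // Require at least one character before and after the @ sign.
  if (at < 1 || at === email.length - 1) {
    return "";
  }
  return email.slice(at + 1).toLowerCase();
}
```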

Storage

  • Signature information is stored in a dedicated table in the Dataprep by Trifacta database.
  • UDF definitions are stored separately as artifacts using the Artifact Storage Service.

Use

Transform Builder

In the Transform Builder:

  • You can search for your UDF by name or by entering UDF.
  • In a Formula field, you can reference the function by name or by entering UDF.  

UDFs and Wrangle

You cannot reference Wrangle objects from within your UDFs. However:

  • When specifying your UDFs in the Transform Builder, you can pass in parameters as inputs.
  • You can combine UDFs with Wrangle functions as part of your Formula.
