Release 8.2

June 11, 2021

What's New

Preferences:

  • Re-organized user account, preferences, and storage settings to streamline the setup process. See Preferences Page.

API:

Connectivity:


Databricks:

Support for Databricks 7.3, using Spark 3.0.1.

NOTE: Databricks 5.5 LTS is scheduled for end of life in July 2021. An upgrade to Databricks 7.3 is recommended.

NOTE: In this release, Spark 3.0.1 is supported for use with Databricks 7.3 only.



Plan metadata references:

Use metadata values from other tasks and from the plan itself in your HTTP task definitions.
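As a sketch of how such a reference might be embedded, the snippet below builds an HTTP task payload that carries metadata placeholders. The `{{...}}` placeholder syntax and the field names are illustrative assumptions, not the product's documented reference syntax:

```python
import json

# Hypothetical HTTP task definition embedding plan metadata references.
# The {{...}} syntax and reference names are assumptions for illustration;
# consult the product documentation for the actual reference syntax.
def build_http_task_body(plan_name_ref, task_status_ref):
    return {
        "url": "https://hooks.example.com/notify",  # placeholder webhook URL
        "method": "POST",
        "body": json.dumps({
            "plan": plan_name_ref,               # e.g. a reference to the plan's name
            "lastTaskStatus": task_status_ref,   # e.g. a prior task's status
        }),
    }

task = build_http_task_body("{{plan.name}}", "{{tasks.run_job.status}}")
```

At plan run time, the platform would substitute real values for the placeholders before the HTTP request is sent.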


Improved accessibility of job results:

The Jobs tabs have been enhanced to display the latest and previous jobs executed for the selected output.

For more information, see View for Outputs.

Sample Jobs Page:

You can monitor the status of all sample jobs that you have generated. Project administrators can access all sample jobs in the workspace. For more information, see Sample Jobs Page.

Install:

Support for Nginx 1.20.0 on the Trifacta node. See System Requirements.

Changes in System Behavior

Java service classpath changes:

NOTE: This required update applies only to customers who have modified their Java service classpaths to include /etc/hadoop/conf.

In deployments on a Hadoop edge node, the classpath values for Java-based services may have been modified to include the following:

/etc/hadoop/conf

As of this release, symlinks must be created to locations within the Trifacta install directory to replace the above path modifications.

NOTE: Before you perform the following update, please create a backup of /etc/hadoop/conf.

In the following example, a symlink is created in the Trifacta conf/hadoop-site directory for each file in the /etc/hadoop/conf directory:

for file in /etc/hadoop/conf/*; do ln -sf "$file" "/opt/trifacta/conf/hadoop-site/$(basename "$file")"; done

Running Environment:

Cloudera 5.x, including Cloudera 5.16, is no longer supported. Please upgrade to a supported version of Cloudera 6.x.

Catalog integrations end of life:

The following catalog integrations are no longer available in the platform:

  • Alation
  • Waterline
  • Cloudera Navigator

For more information, see End of Life and Deprecated Features.

API:

The following API endpoints are scheduled for deprecation in a future release:

NOTE: Please avoid using the following endpoints.

/v4/connections/vendors
/v4/connections/credentialTypes
/v4/connections/:id/publish/info
/v4/connections/:id/import/info

These endpoints have little value for public use.
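Before the endpoints are removed, it may be worth auditing client scripts for calls to them. The path-matching logic below is a simple sketch; only the four endpoint paths come from the list above:

```python
import re

# The four endpoints scheduled for deprecation, as listed above.
# The :id segment is matched as a numeric path component.
DEPRECATED = [
    "/v4/connections/vendors",
    "/v4/connections/credentialTypes",
    re.compile(r"/v4/connections/\d+/publish/info"),
    re.compile(r"/v4/connections/\d+/import/info"),
]

def uses_deprecated_endpoint(url_path):
    """Return True if a request path hits one of the deprecated endpoints."""
    for pattern in DEPRECATED:
        if isinstance(pattern, str):
            if url_path == pattern:
                return True
        elif pattern.fullmatch(url_path):
            return True
    return False
```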

Key Fixes

Ticket      Description
TD-59854    Datetime column from Parquet file inferred as the wrong data type on import.
TD-59658    IAM roles passed through SAML do not update after hotfix upgrade.
TD-59633    With the session tag feature enabled, jobs fail with the "The security token included in the request is invalid" error.
TD-59331    When the include quotes option is disabled on an output, Databricks still places quotes around empty values.
TD-59128    BOM characters at the beginning of a file cause multiple headers to appear in the Transformer page.
TD-58932    Cannot read file paths with colons from EMR Spark jobs.
TD-58694    Very large number of files generated during Spark job execution.
TD-58523    Cannot import dataset with filename in Korean alphabet from HDFS.

New Known Issues

Ticket      Description
TD-60701    Most non-ASCII characters incorrectly represented in visual profile downloaded in PDF format.

Release 8.1

February 26, 2021

What's New

In-app messaging: Be sure to check out the new in-app messaging feature, which allows us to share new features and relevant content with Trifacta Wrangler Enterprise users in your workspace. The user messaging feature can be disabled by workspace administrators if necessary. See Workspace Settings Page.


Install:

  • Support for PostgreSQL 12.X for Trifacta databases on all supported operating systems.

    NOTE: Beginning in this release, the latest stable release of PostgreSQL 12 can be installed with the Trifacta platform. Earlier versions of PostgreSQL 12.X can be installed manually.

    NOTE: Support for PostgreSQL 9.6 is deprecated for customer-managed Hadoop-based deployments and AWS deployments. PostgreSQL 9.6 is supported only for Azure deployments. When Azure supports PostgreSQL 12 or later, support for PostgreSQL 9.6 will be deprecated in the subsequent release of Trifacta Wrangler Enterprise.


Security:

Databases:

  • New databases:

    • The Secure Token Service database is used for managing the tokens used by the secure token service.
    • The Connector Configuration Service database stores the connection configuration information for a workspace's available connectors (connection types).

    These databases are installed and managed in conjunction with the other Trifacta databases. See Install Databases.

Connectivity:

  • For AWS-based installations, you can create multiple read-only S3 connections through the Trifacta application. These connections use key and secret pair combinations to access specific S3 buckets. For more information, see S3 Connections.



  • You can enable logging of events from the CData driver underlying your supported relational connections. For more information, see Configure Connectivity.

Authorization:

Sharing:

  • Define permissions on individual objects when they are shared.

    NOTE: Fine-grained sharing permissions apply to flows and connections only.

    For more information, see Changes to User Management.

API:
  • Apply job-level overrides to AWS Databricks or Azure Databricks job executions via API. See API Workflow - Run Job.


  • Customize connection types (connectors) to ensure consistency across all connections of the same type and to meet your enterprise requirements. For more information, see Changes to the APIs.
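The Databricks job-level overrides mentioned above can be sketched as a request to the job-launch endpoint. The request shape below follows the typical `/v4/jobGroups` pattern, but the override keys and values are illustrative assumptions; the documented set is in API Workflow - Run Job:

```python
import json
from urllib import request

# Sketch: launching a job with job-level overrides via the REST API.
# The "execution" value and "sparkOptions" key below are assumptions for
# illustration; see "API Workflow - Run Job" for the supported overrides.
def run_job_with_overrides(base_url, token, output_object_id):
    body = {
        "wrangledDataset": {"id": output_object_id},
        "overrides": {
            "execution": "databricksSpark",     # assumed value
            "sparkOptions": [                   # assumed override key
                {"key": "spark.executor.memory", "value": "8g"},
            ],
        },
    }
    return request.Request(
        f"{base_url}/v4/jobGroups",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = run_job_with_overrides("https://example.trifacta.net", "MY_TOKEN", 7)
# The caller would submit the request with request.urlopen(req).
```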

Running environment:



Publishing:

Macro updates:

You can replace an existing macro definition with a macro that you have exported to your local desktop.

NOTE: Before you replace the existing macro, you must export a macro to your local desktop. For more information, see Export Macro.

For more information, see Macros Page.

Sample Jobs Page:

You can monitor the status of all sample jobs that you have generated. Project administrators can access all sample jobs in the workspace. For more information, see Sample Jobs Page.

Specify column headers during import:

You can specify the column headers for your dataset during import. For more information, see Import Data Page.

Services:

Changes in System Behavior

NOTE: CDH 6.1 is no longer supported. Please upgrade to the latest supported version. For more information, see Product Support Matrix.

NOTE: HDP 2.6 is no longer supported. Please upgrade to the latest supported version. For more information, see Product Support Matrix.


Support for custom data types based on dictionary files to be deprecated:

NOTE: The ability to upload dictionary files and use their contents to define custom data types is scheduled for deprecation in a future release. This feature is limited and inflexible. Until an improved feature can be released, please consider using workarounds. For more information, see Validate Your Data.

You can create custom data types using regular expressions. For more information, see Create Custom Data Types.


Strong consistency management now provided by AWS S3:

Prior to this release, S3 sometimes did not accurately report the files that had been written to it, which resulted in consistency issues between the files that were written to disk and the files that were reported back to the Trifacta application.

As of this release, AWS has improved S3 with strong consistency checking, which removes the need for the product to maintain a manifest file containing the list of files that have been written to S3 during job execution.

NOTE: As of this release, the S3 job manifest file is no longer maintained. All configuration related to this feature has been removed from the product. No additional configuration is needed.

For more information, see https://aws.amazon.com/s3/consistency/.

For more information on integration with S3, see Enable S3 Access.


Installation of database client is now required:

Before you install or upgrade the database or perform any required database cross-migrations, you must install the appropriate database client.

NOTE: Use of the database client provided with each supported database distribution is now a required part of any installation or upgrade of the Trifacta platform.

For more information: 

Job logs collected asynchronously for Databricks jobs:

In prior releases, the Trifacta application reported that a job failed only after the job logs had been collected from the Databricks cluster. This log collection process could take a while to complete, and the job was reported as in progress when it had already failed.

Beginning in this release, collection of Databricks job logs for failed jobs happens asynchronously. Jobs are now reported in the Trifacta application as soon as they are known to have failed. Log collection happens in the background afterward.

Catalog integrations now deprecated:

Integrations between Trifacta Wrangler Enterprise and Alation and Waterline services are now deprecated. For more information, see End of Life and Deprecated Features.

Key Bug Fixes

Ticket      Description
TD-56170    The Test Connection button for some relational connection types does not perform a test authentication of user credentials.
TD-54440    Header sizes at intermediate nodes for JDBC queries cannot be larger than 16K.

Previously, the column names for JDBC data sources were passed as part of a header in a GET request. For very wide datasets, these GET requests often exceeded 16K in size, which represented a security risk.

The solution is to turn these GET requests into ingestion jobs.

NOTE: To mitigate this issue, JDBC ingestion and JDBC long loading must be enabled in your environment. For more information, see Configure JDBC Ingestion.


New Known Issues

Ticket      Description
TD-58818    Cannot run jobs on some builds of HDP 2.6.5 and later. There is a known incompatibility between HDP 2.6.5.307-2 and later and the Hadoop bundle JARs that are shipped with the Trifacta installer.

Solution: Use an earlier compatible version. For more information, see Configure for Hortonworks.

TD-58523    Cannot import dataset with filename in Korean alphabet from HDFS.

Workaround: You can upload files with Korean characters from your desktop. You can also append a 1 to the end of the filename on HDFS, after which it can be imported.

TD-55299    Imported datasets with encodings other than UTF-8 and line delimiters other than \n may generate empty outputs on Spark or Dataflow running environments.

TD-51516    Input data containing BOM (byte order mark) characters may cause Spark or Dataflow running environments to read data improperly and/or generate invalid results.

Release 8.0

January 26, 2021

What's New

APIs:

  • Individual workspace users can be permitted to create and use their own access tokens for use with the REST APIs. For more information, see Workspace Settings Page.
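A minimal sketch of using such a token from a client script, assuming a standard Bearer scheme (the base URL and token value are placeholders):

```python
from urllib import request

# Sketch: authenticating a REST API call with a personal access token.
# The base URL and token are placeholders; generate a real token from
# your user preferences once the workspace setting is enabled.
def authorized_request(base_url, token, path):
    return request.Request(
        f"{base_url}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )

req = authorized_request("https://example.trifacta.net", "MY_TOKEN", "/v4/flows")
# The caller would submit the request with request.urlopen(req).
```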

Connectivity:

Import:

  • Improved method for conversion and ingestion of XLS/XLSX files. For more information, see Import Excel Data.

Recipe development:

  • The Flag for Review feature enables you to set review checkpoints in your recipes. You can flag recipe steps for other collaborators to review and approve. For more information, see Flag for Review.

Update Macros:

  • Replace or overwrite an existing macro's steps and inputs with a newly created macro.
  • Map new macro parameters to the existing parameters before replacing.
  • Edit macro input names and default values as needed. 

Job execution:

  • You can enable the Trifacta application to apply SQL filter pushdowns to your relational datasources to remove unused rows before their data is imported for a job execution. This optimization can significantly improve performance as less data is transferred during the job run. For more information, see Flow Optimization Settings Dialog.
  • Optimizations that were applied during the job run now appear in the Job Details Page. See Job Details Page.
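The pushdown optimization described above can be illustrated conceptually: instead of importing the whole table and filtering afterward, the filter is folded into the source query so unused rows never leave the database. The table and column names below are invented for illustration:

```python
# Conceptual sketch of a SQL filter pushdown. Without pushdown, the full
# table is imported and rows are discarded afterward; with pushdown, the
# filter becomes part of the source query itself.
def build_source_query(table, columns, filter_clause=None):
    query = f"SELECT {', '.join(columns)} FROM {table}"
    if filter_clause:
        query += f" WHERE {filter_clause}"  # the pushed-down filter
    return query

q = build_source_query("orders", ["id", "total"], "order_date >= '2021-01-01'")
```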

Changes in System Behavior

None.

Key Bug Fixes

Ticket      Description
TD-57354    Cannot import data from Azure Databricks. This issue is caused by an incompatibility between Java 8 and TLS v1.3, which was backported to Java 8.
TD-57180    AWS jobs run on Photon to publish to HYPER format fail during file conversion or writing.

New Known Issues

Ticket      Description
TD-56170    The Test Connection button for some relational connection types does not perform a test authentication of user credentials.

Workaround: Append the following to your Connect String Options:

;ConnectOnOpen=true

This option forces the connection to validate user credentials as part of the connection. There may be a performance penalty when this option is used.
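A small sketch of applying this workaround programmatically, appending the option only when it is not already present (the function name is invented for illustration):

```python
# Sketch: idempotently append ConnectOnOpen=true to a connection's
# Connect String Options so credentials are validated when the
# connection is opened.
def with_connect_on_open(connect_string_options):
    if "ConnectOnOpen=true" in connect_string_options:
        return connect_string_options
    return connect_string_options + ";ConnectOnOpen=true"
```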


Release 7.10

December 21, 2020

What's New

Tip: Check out the new in-app tours, which walk you through the steps of wrangling your datasets into clean, actionable data.

Import:

  • The maximum permitted size of a file uploaded through the Trifacta application has been increased from 100 MB to 1 GB.

Plan View:

  • Import and Export Plans: You can import and export plans from one environment, workspace, or project to another.

For more information, see Export Plan.

For more information, see Import Plan.

  • Share Plans: Share plans with one or more users to work together on the same plan. For more information, see Share a Plan.
  • Email notifications: Send email notifications to plan owners and collaborators based on the status of execution of plans. For more information, see Manage Plan Notifications Dialog.

Authentication:

Connectivity:

Language:

API:

  • Experimental feature: Export Python Pandas code to generate the transformation steps required to produce a defined output object.

    NOTE: This feature can be changed or removed from the platform at any time without notice. Do not deploy it in a production environment.

    For more information, see API Workflow - Wrangle Output to Python.

Changes in System Behavior


Rebuild custom UDF JARs for Databricks clusters

Previously, UDF files were checked for consistency based upon the creation time of the JAR file. However, if the JAR file was passed between Databricks nodes in a high availability environment or between services in the platform, this timestamp could change, which could cause job failures due to checks on the created-at timestamps.

Beginning in this release, the platform now inserts a build-at timestamp into the custom UDF manifest file when the JAR is built. This value is fixed, regardless of the location of the copy of the JAR file.

NOTE: Custom UDF JARs that were created using earlier releases of the platform and deployed to a Databricks cluster must be rebuilt and redeployed as of this release. For more information on troubleshooting the error conditions, see Java UDFs.

Custom credential provider JAR no longer required for EMR access

In prior releases of Trifacta Wrangler Enterprise, integration with EMR required the deployment of a custom credential provider JAR file provided by the customer as part of the initial bootstrap of the EMR cluster. As of this release, this JAR file is no longer required. Instead, it is provided by the Trifacta platform directly.

NOTE: If your deployment of the Trifacta platform integrates with AWS Glue, you must still provide and deploy a custom credentials JAR file. For more information, see Enable AWS Glue Access.

For more information on integrating with EMR, see Configure for EMR.

Upgrade nodeJS

On the Trifacta node, the version of nodeJS has been upgraded to nodeJS 14.15.4 LTS. For more information, see System Requirements.


Data type and row split inference utilize more data

When a dataset is loaded, the Trifacta application now reads in more data before the type inference system and row splitting transformations analyze the data to break it into rows and columns. This larger data size should result in better data inference in the system.

NOTE: Types and row splits on pre-existing datasets may be affected by this change.

For more information, see Improvements to the Type System.

Key Bug Fixes

Ticket      Description
TD-54742    Access to S3 is disabled after upgrade.
TD-53527    When importing a dataset via API that is sourced from a BZIP file stored on S3, the columns may not be properly split when the platform is permitted to detect the structure.

New Known Issues

Ticket      Description
TD-57180    AWS jobs run on Photon to publish to HYPER format fail during file conversion or writing.

Workaround: Run the job on the Spark running environment instead.

TD-56830    Receive a "malformed_query: enter a filter criterion" error when importing a table from Salesforce.

NOTE: Some Salesforce tables require mandatory filters when they are queried. Mandatory filters are not currently supported for Salesforce connections.

Release 7.9

November 16, 2020

What's New

Plan View:

  • Execute Plan using status rules: Starting in Release 7.9, you can execute tasks based on the previous task execution result. For more information, see Create a Plan.
  • Execute Parallel Plan tasks: In previous releases, plans were limited to a sequential order of task execution. Beginning in Release 7.9, you can create branches in the graph into separate parallel nodes, enabling the corresponding tasks to run in parallel. This feature enables you to have a greater level of control of your plans' workflows. For more information, see Create a Plan.
  • Zoom options: Zoom control options and keyboard shortcuts have been introduced in the plan canvas. For more information, see Plan View Page.
  • Filter Plan Runs: Filter your plan runs based on dates or plan types. For more information, see Plan Runs Page.

Transform Builder:

  • An All option has been added for selecting columns in the Transform Builder. For more information, see Changes to the Language page.

Changes in System Behavior

Manage Users section has been deprecated:

In previous releases, user management functions were available through the Manage Users section of the Admin Settings page. These functions have been migrated to the Workspace Settings page, where all of the previous functions are now available. The Manage Users section has been deprecated.

Better license management:

In prior releases, the Trifacta application locked out all users if the number of active users exceeded the number permitted by the license. This situation could occur if users were being added via API, for example. 

Beginning in this release, the Trifacta application does not block access when the number of licensed users is exceeded. 

NOTE: If you see the notification banner about license key violations, please adjust your users until the banner is removed. If you need to adjust the number of users associated with your license key, please contact Trifacta Support.

For more information, see License Key.

Trifacta Photon jobs now use ingestion for relational sources:

When a job is run on Trifacta Photon, any relational data sources are ingested into the backend datastore as a preliminary step during sampling or transformation execution. This change aligns Trifacta Photon job execution with future improvements to the overall job execution framework. No additional configuration is required.

Tip: Jobs that are executed on the Trifacta Server are executed in an embedded running environment, called Trifacta Photon. Quick Scan samples are automatically executed in Trifacta Photon.


For more information on ingestion, see Configure JDBC Ingestion.

Job results page changes:

  • The Dependencies tab has been renamed to the Dependency Graph tab.
  • The old flow view in the Dependency Graph tab is replaced with the new flow view. For more information, see Job Details Page.

Key Bug Fixes

Ticket      Description
TD-55125    Cannot copy flow. However, export and import of the flow enables copying.
TD-53475    Missing associated artifact error when importing a flow.

New Known Issues

None.

Release 7.8

October 19, 2020

What's New

Plans:


  • The viewport position and zoom level are now preserved when returning to a given flow.

Publishing:

  • Improved performance when publishing to Tableau Server.
  • Configure publishing chunk sizes as needed. For more information, see Configure Data Service.

Language:

  • Rename Columns now supports converting column names to uppercase or lowercase, and shortening column names to a specified character length from the left or right. For more information, see Changes to the Language.

Connectivity:


Changes in System Behavior

JDBC connection pooling disabled:

NOTE: The ability to create connection pools for JDBC-based connections has been disabled. Although it can be re-enabled if necessary, it is likely to be removed in a future release. For more information, see Changes to Configuration.

TDE format has been deprecated:

Tableau Server has deprecated support for the TDE file format. As of this release, all outputs and publications to Tableau Server must be generated using HYPER, the replacement format for TDE. 

  • Any flow that uses TDE format is automatically switched to use HYPER format during the upgrade process.
  •  Any flow that is imported into the upgraded environment is automatically switched to using the HYPER format.

For more information, see Tableau Hyper Data Type Conversions.

Enhanced Flow and Flow View menu options:

The context menu options for Flow View and Flow have been renamed and reorganized for a better user experience.

Key Bug Fixes

None.

New Known Issues

Ticket      Description
TD-54030    When creating custom datasets from Snowflake, columns containing time zone data are rendered as null values in visual profiles, and publishing back to Snowflake fails.

Workaround: In your SELECT statement applied to a Snowflake database, references to time zone-based data must be wrapped in a function to convert it to UTC time zone. For more information, see Create Dataset with SQL.
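A sketch of this workaround, building the custom SQL so that each time zone-based column is wrapped in Snowflake's CONVERT_TIMEZONE function. The table and column names are invented for illustration:

```python
# Sketch: wrap time zone-based columns in a UTC conversion inside the
# custom SQL used to create the dataset. CONVERT_TIMEZONE is a Snowflake
# function; table and column names here are invented.
def utc_wrapped_select(table, tz_columns, other_columns):
    cols = list(other_columns)
    for c in tz_columns:
        cols.append(f"CONVERT_TIMEZONE('UTC', {c}) AS {c}")
    return f"SELECT {', '.join(cols)} FROM {table}"

sql = utc_wrapped_select("events", ["event_ts"], ["id"])
```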

Release 7.7

September 21, 2020

What's New

Flow View:

  • Automatically organize the nodes of your flow with a single click. See Flow View Page.

Changes in System Behavior

Parameter History panel feature deprecated

As part of the collaborative suggestions enhancement, support for the Parameter History panel is deprecated. For more information on the collaborative suggestions feature, see Overview of Predictive Transformation.

Classic Flow View no longer available

In Release 7.6, an improved version of Flow View was released. At the time of release, users could switch back to using the classic version. 

Beginning in this release, the classic version of Flow View is no longer available. 

Tip: The objects in your flows that were created in classic Flow View may be misaligned in the new version of Flow View. You can use auto-arrange to re-align your flow objects.

For more information, see Flow View Page.

Key Bug Fixes

Ticket      Description
TD-53318    Cannot publish results to relational targets when the flow name, output filename, or table name contains a hyphen (e.g., my - filename.csv).

New Known Issues

None.

