December 7, 2020
- Updated supported versions of EMR to address log4j2 issues. Please upgrade to EMR 5.30.2.
- For more information, see https://docs.trifacta.com/display/PUB/Trifacta+Alert+TD-67372+-+0-Day+Exploit+in+log4j2+for+Self+Managed.
Support for PostgreSQL 12.3 for Trifacta databases on all supported operating systems.
NOTE: This feature is in Beta release.
NOTE: Support for PostgreSQL 9.6 will be deprecated in a future release.
- Support for configurable endpoints in AWS GovCloud. See Configure for AWS.
Changes in System Behavior
Installation of database client is now required:
Beginning in this release, before you install or upgrade the database or perform any required database cross-migrations, you must install the appropriate database client first.
NOTE: Use of the database client provided with each supported database distribution is now a required part of any installation or upgrade of the Trifacta platform.
NOTE: The MySQL database client cannot be provided by Trifacta. It must be downloaded and installed separately. As a result, installation or upgrade of a Docker environment using MySQL requires additional support. For more information, please contact Trifacta Customer Success Services.
For more information:
Catalog support to be deprecated:
NOTE: Integrations with Alation and Waterline catalogs are likely to be deprecated in a future release.
Support for custom data types based on dictionary files to be deprecated:
NOTE: The ability to upload dictionary files and use their contents to define custom data types is scheduled for deprecation in a future release. This feature is limited and inflexible. Until an improved feature can be released, please consider using workarounds. For more information, see Validate Your Data.
You can create custom data types using regular expressions. For more information, see Create Custom Data Types.
Maintenance release updater script is deprecated:
The maintenance release updater script has been deprecated. This script could be used for performing maintenance upgrades:
- Release X.Y.1 to Release X.Y.2
- Hot Fixes
New Known Issues
Cannot run jobs on some builds of HDP 2.6.5 and later. There is a known incompatibility between HDP 126.96.36.1997-2 and later and the Hadoop bundle JARs that are shipped with the Trifacta installer.
Solution: Use an earlier compatible version. For more information, see Configure for Hortonworks.
Cannot import data from Azure Databricks. This issue is caused by an incompatibility introduced when TLS v1.3 was backported to Java 8.
This issue is known to impact Marketplace installs of Trifacta Self-Managed Enterprise Edition and can impact on-premises installs.
Workaround: Downgrade Java on the Trifacta node to openJDKv1.8.0_212 or earlier. Java 8 is required. After you have downgraded, restart the platform. For more information, see System Requirements.
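The version constraint in the workaround above can be checked with a short script. The following is a minimal sketch, not a Trifacta utility: the function names are hypothetical, and it assumes the Java version is reported in the usual Java 8 form `1.8.0_NNN`, optionally followed by a build suffix such as `-b08`.

```python
import re

# Highest Java 8 update named in the workaround above.
MAX_UPDATE = 212


def parse_java8_update(version):
    """Return the update number from a '1.8.0_NNN' version string, or None."""
    match = re.fullmatch(r"1\.8\.0_(\d+)(?:-.*)?", version.strip())
    return int(match.group(1)) if match else None


def is_compatible(version):
    """True only for Java 8 builds at or below update MAX_UPDATE."""
    update = parse_java8_update(version)
    return update is not None and update <= MAX_UPDATE


if __name__ == "__main__":
    for candidate in ("1.8.0_212", "1.8.0_265", "11.0.8"):
        print(candidate, is_compatible(candidate))
```

The version string itself can be read from the output of `java -version` on the Trifacta node.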
Non-default admin users are not automatically granted full workspace admin privileges on upgrade. These users may be able to see Workspace Settings and Admin Settings but are not granted access to edit roles and users.
Workaround: Log in as the default admin user. Select User menu > Admin Console > Roles. For the Workspace Admin role, select Assign Role. Assign the role to the non-default admin users.
For more information:
Access to S3 is disabled after upgrade.
Workaround: This issue is caused by the migration of the S3 enablement setting into the Workspace Settings page. To address, set
When importing a dataset via API that is sourced from a BZIP file stored on a backend datastore such as S3, WASB, or ADLS Gen1/Gen2, the columns may not be properly split when the platform is permitted to detect the structure.
Workaround: Import the dataset via UI. If you must still import via API, please change webapp.loadLimitForSplitInference to 900000. See Admin Settings Page.
September 7, 2020
New Flow View is now generally available:
- Drag and drop to reposition objects on the Flow View canvas, and zoom in and out to focus on areas of development.
- Perform joins and unions between objects on the Flow View canvas.
- Search for flow objects by name or by type.
- Annotate the canvas with notes.
Tip: The relative position of objects on the flow view canvas is preserved between screen updates. On refresh, the window on the canvas is repositioned based on the leftmost object on the canvas to focus on the flow to other objects from that one.
NOTE: Classic Flow View is no longer available.
See Flow View Page.
Support for PostgreSQL 12.3 for Trifacta databases on CentOS/RHEL 7.
NOTE: This feature is in Beta release.
Support for Cloudera Data Platform.
NOTE: Installation requirements for Cloudera Data Platform are consistent with installation for CDH. The Trifacta platform must be installed on a pre-existing Cloudera Data Platform.
There are minor differences in configuration. For more information, see Configure for Cloudera.
- Support for high availability on AWS. For more information, see Install for High Availability on AWS.
- On-premises installations can be deployed in a highly available environment. For more information, see Install for High Availability.
- Support for high availability integration with EMR clusters. For more information, see Configure for EMR.
- Support for Spark 2.4.6. For more information, see Configure for Spark.
Support for EMR 5.30.1.
NOTE: Avoid EMR 5.30.0. Instead, please use EMR 5.30.1.
See Configure for EMR.
For long-loading relational datasets, you can monitor the ingest process through Flow View as you continue your work.
NOTE: This feature may require enablement in your deployment. For more information, see Configure JDBC Ingestion.
For more information, see Flow View Page.
Improved performance when browsing databases for tables to import.
Tip: Performance improvements are due to limiting the volume of table metadata that is imported when paging through available tables. This metadata can be retrieved when you hover over a table in the database browser.
For more information, see Database Browser.
Logical and physical optimizations when reading from relational sources during job execution, which includes column pruning push-down among other enhancements.
NOTE: This feature may need to be enabled in your workspace. See Workspace Settings Page.
This feature applies to the following relational connections in this release:
- Configure advanced settings on your flow and its job executions. See Manage Flow Advanced Settings Dialog.
- Apply overrides to recipe parameters for your plans. See Plan View Page.
- New Plan Runs page:
- Monitor status of all of your plan runs and drill into details.
- Download logs for plan runs and individual flow tasks in the run.
- See Plan Runs Page.
Collaborative suggestions allow users within a workspace to receive suggestions based on the transformations that have been recently created by themselves or by all members of the workspace. As more users generate transformations, the relevance of these suggestions to the data in the workspace continues to improve.
- Create and edit flow parameters and their overrides while editing your recipe. See Transformer Page.
- For more information on editing flow parameters, see Manage Parameters Dialog.
Support for job cancellation on EMR clusters. See Jobs Page.
NOTE: Additional configuration may be required. For more information, see Configure for EMR.
- Azure Databricks enhancements:
- Support for creating clusters from instance pools across multiple Databricks workspaces, using Databricks pool names.
- Manage jobs on Azure Databricks to prevent reaching Databricks workspace limits.
- See Configure for Azure Databricks.
- When profiling is enabled for a Spark job, the transformation and profiling steps are combined into a single task, which optimizes the execution of transform and profiling tasks for a Spark job. For more information, see Configure for Spark.
- Workspace administrators can now create and assign roles to govern access to types of objects in the workspace. For more information, see Changes to User Management.
- Support for configurable Azure AD endpoint and authority, including Gov Cloud. For more information, see Configure SSO for Azure AD.
- Improved performance for Oracle and SQL Server connections. These performance improvements will be applied to other relational connections in future releases.
- Improved performance when reading from Hive with many partitions into the Transformer page.
- New approximation functions for median, percentile, and quartile based on a very fast algorithm.
- New functions to encode and decode base64 strings.
- New weekday name function.
- New rolling window date functions.
- New Kth-largest functions.
- New conditional minimum, maximum, and mode date functions.
- See Changes to the Language.
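To give a rough sense of what some of the new function families compute, here is a minimal sketch written as Python analogues. These are not the platform's functions, and the helper names below are hypothetical. The sketch assumes that a Kth-largest function with k=1 returns the maximum and counts duplicate values separately. For the exact names and semantics, see Changes to the Language.

```python
import base64
import heapq
from datetime import date

# Fixed English names so the result does not depend on the system locale.
WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def encode_base64(text):
    """Encode a UTF-8 string as a base64 string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_base64(encoded):
    """Decode a base64 string back to a UTF-8 string."""
    return base64.b64decode(encoded).decode("utf-8")


def weekday_name(day):
    """Return the weekday name for a date."""
    return WEEKDAY_NAMES[day.weekday()]


def kth_largest(values, k):
    """Return the k-th largest value; k=1 is the maximum (assumed semantics)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    return heapq.nlargest(k, values)[-1]


def min_date_if(rows):
    """Conditional minimum: earliest date among (date, flag) rows where flag is true."""
    matching = [day for day, flag in rows if flag]
    return min(matching) if matching else None


if __name__ == "__main__":
    print(encode_base64("hello"))
    print(weekday_name(date(2020, 9, 7)))
    print(kth_largest([5, 1, 9, 3], 2))
```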
- Additional connect string options and troubleshooting information have been included for specific relational connections. For more information, see Connection Types.
Changes in System Behavior
End of Life for Wrangler Enterprise desktop application
The Wrangler Enterprise desktop application is no longer available for installation and is not supported for use with the product. Please use one of the supported browser versions instead. For more information, see Desktop Requirements.
Users section of Admin Settings is disabled
In previous releases, the Users section of the Admin Settings page was used to manage users.
- Beginning in this release, the above area has been replaced by the Workspace Users page, where almost all user management tasks can be performed.
- The Users section must still be used for assigning the Trifacta admin platform role and the SSO, Hadoop, or Kerberos principals through the Trifacta application. It can be enabled as needed.
- For more information, see Changes to User Management.
CentOS/RHEL 7.1 and 7.2 deprecated
Please upgrade to a supported distribution of either operating system. For more information, see System Requirements.
S3 access uses Java VFS service
Access to S3 is now managed through the Java-based virtual file system. For more information, see Configure Java VFS Service.
NOTE: No configuration changes are required for upgrading customers. For more information, see Enable S3 Access.
Schema information is retained
When schematized datasources are ingested, schema information is now retained for publication of job results.
NOTE: In prior releases, you may have set column data types manually because this schema information was lost during the ingest process. You may need to remove these manual steps from your recipe. For more information, see Improvements to the Type System.
Enhanced PII masking
For social security numbers and credit card numbers, the methods by which these values are determined for purposes of masking sensitive Personally Identifiable Information (PII) have been expanded and improved. For more information, see Improvements to the Type System.
Updated credential types for connections via API
- Redshift connections now require a different credential type.
- Snowflake connections now require a different credential type.
- See Changes to the APIs.
Optimizer service and database: During job execution on relational sources, the optimizer service assists in managing SQL queries efficiently so that smaller volumes of data are retrieved for the job. Queries are stored in the related database.
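The idea of retrieving smaller volumes of data can be illustrated with column pruning, where only the columns that a job actually uses are requested from the source. The following is a minimal sketch of that idea, not the optimizer service's implementation. The function name, table, and column names are hypothetical.

```python
def pruned_select(table, table_columns, used_columns):
    """Build a SELECT that requests only the columns the job uses.

    table_columns keeps the source column order; used_columns is the set
    of columns referenced by the recipe (hypothetical input).
    """
    kept = [col for col in table_columns if col in used_columns]
    if not kept:
        raise ValueError("the job does not reference any column of the table")
    column_list = ", ".join('"{}"'.format(col) for col in kept)
    return 'SELECT {} FROM "{}"'.format(column_list, table)


if __name__ == "__main__":
    # Only two of the four columns are needed, so only those are requested.
    print(pruned_select("orders",
                        ["id", "customer", "notes", "total"],
                        {"id", "total"}))
```

Because the pruned query is executed by the database, unused columns are never transferred to the platform.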
Key Bug Fixes
API: Unable to update awsConfig objects in per-user or per-workspace modes.
|TD-51229||When an admin user shares a flow that the admin user owns, a |
|TD-48915||Inserting special characters in an output filename results in a validation error in the application and job failures.|
|TD-47696||Platform appears to fail to restart properly through Admin Settings page due to longer restarts of individual services.|
|TD-49559||Cannot select and apply custom data types through column Type menu.|
|TD-47473||Uploaded files (CSV, XLS, PDF) that contain a space in the filename fail to be converted.|
|TD-34840||Platform fails to provide suggestions for transformations when selecting keys from an object with many of them.|
New Known Issues
Import of dataset from Alation catalog hangs.
NOTE: The Alation catalog integration is not working in Release 7.6. For more information, please contact Trifacta Support.
If a flow is unshared with you, you cannot see or access the datasources for any jobs that you have already run on the flow. You can still access the job results.