Release 3.2.1

This release contains numerous bug fixes and some interesting new features.

What's New

Transformer Page:

  • Specify create, append, or replace actions for your file publishing destinations. See Run Job Page.

Workspace:

Admin, Install, & Config:

Changes to System Behavior

Changes to the Language:

Changes to the Command Line Interface:

  • New file publishing options enable specifying create, append, and replace actions for file publishing destinations. 
    • output_path is now a required parameter for commands that use it. 

      NOTE: When specifying publishing options in the CLI, you may specify one file format only for the output.

  • See Changes to the Command Line Interface.
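The constraints above (a publish action of create, append, or replace; a required output_path; a single output format) can be sketched as a pre-flight check. This is a minimal illustration only: the function name, argument names, and messages are hypothetical and are not part of the platform's CLI.

```python
# Hypothetical pre-flight check; none of these names come from the CLI itself.
ALLOWED_ACTIONS = {"create", "append", "replace"}


def validate_publish_options(action, output_path, formats):
    """Return a list of problems found in a set of publishing options."""
    problems = []
    if action not in ALLOWED_ACTIONS:
        problems.append(f"unknown publish action: {action!r}")
    if not output_path:
        # output_path is required for commands that use it.
        problems.append("output_path is required")
    if len(formats) != 1:
        # Only one file format may be specified with publishing options.
        problems.append("exactly one output format must be specified")
    return problems


print(validate_publish_options("append", "/out/result.csv", ["csv"]))
print(validate_publish_options("replace", "", ["csv", "json"]))
```

An empty list means the options satisfy the three rules listed in this section. The check says nothing about other requirements the CLI may enforce.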

Miscellaneous Changes:

  • The setting to include headers in CSV downloads is now managed as part of the job publication workflow on a per-job basis. 
    • This setting is no longer available in the Admin Settings page.
    • For more information, see Run Job Page.
  • Access to S3 sources no longer requires the ListAllMyBuckets permission. If the permission is not granted: 
    • Users cannot see default buckets through the application.
    • Default buckets must be explicitly configured to be displayed from within the application. 
    • Users can still access unlisted buckets by directly entering the full path in the S3 browser.
    • See Enable S3 Access.
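As an illustration of access that does not rely on ListAllMyBuckets, the following sketch builds an IAM policy document scoped to a single bucket. The bucket name and the choice of object-level actions are assumptions made for the example. The exact permissions that the platform requires are not stated in these notes; see Enable S3 Access.

```python
import json


def build_bucket_policy(bucket_name):
    """Build an IAM policy document that grants access to one bucket only.

    The policy deliberately omits s3:ListAllMyBuckets, so the account's
    bucket list cannot be enumerated with it.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing objects applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
            {
                # Reading and writing objects applies to keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


# "example-bucket" is a placeholder name.
policy = build_bucket_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

With a policy of this shape, the bucket is not discoverable by listing. That matches the behavior described above, where buckets must be explicitly configured or entered by full path.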

Key Bug Fixes

Ticket | Description
TD-19404 | Split transform using at parameter values out of range of cell size generates an error in Pig.
TD-19150 | On Photon, unnest transform fails if pluck=true.
TD-19032 | Swapping rapidly between source datasets that have already been edited may cause a No samples found error.
TD-18933 | You cannot load a dataset that utilizes another dataset via join or union three levels deep.
TD-18268 | If you profile a wide column (one that contains many characters of data in each cell value), the machine learning service can crash.
TD-18093 | Changes to a dataset that generates new columns can break any downstream lookups that use the dataset.

 

New Known Issues

Ticket | Component | Description
TD-20736 | Compilation/Execution

Publish to Redshift of single-file CSV or JSON files fails.

Workaround: Publish files to Redshift as multi-part files. See Run Job Page.

TD-19898 | Installer/Upgrader/Utilities

After upgrade, job card summaries in the Jobs page may fail to load for jobs executed in the pre-upgrade version with steps containing functions that have been renamed.

Workaround: You can re-run the job in the upgraded version. For more information on the renamed functions for Release 3.2.1, see Changes to the Language.

TD-19870 | Compilation/Execution

When publishing to S3, you cannot write to a single file in an append publishing action.

Workaround: You can change the publish action to recreate the object, replace the object, or save it as a multi-file output.

TD-19866 | Transformation

When switching between an append and a replace publishing action in the application, any selected compression setting is lost. You cannot set this value again.

Workaround: Cancel the edit in progress. Re-edit the publishing action to apply the compression setting to the replace transform.

TD-19865 | Workspace

You cannot configure a publishing location to be a directory that does not already exist.

Workaround: Create the directory on the datastore outside of the Trifacta platform. Verify that the appropriate user accounts have access to the directory.

TD-19852 | Compilation/Execution

Users are permitted to select compressed formats for the append publish action, which is not supported.

NOTE: For Release 3.2.1, the append publish action does not support the use of compression.

TD-19827 | Compilation/Execution

Job execution fails, and the job log contains a java.lang.OutOfMemoryError: unable to create new native thread exception.

Workaround: You can try to raise the soft and hard limits on the number of processes available to the platform. For more information, see Miscellaneous Configuration.

TD-19678 | Transformer Page

Column browser does not recognize when you place a checkmark next to the last column in the list.

Workaround: You can move the column to another location and then select it.


TD-19384 | Transformer Page

Preview cards take a long time to load when selecting values from a Datetime column.

Workaround: For selection purposes, you can change the data type to String. Then, make your selections and build your transform steps before switching back to Datetime data type.

TD-18584 | Type System

settype transforms that do not include a specified Datetime formatting string and its variant fail on upgrade. In previous releases, this formatting was permitted, and the variant to apply was inferred.

Workaround: Please review the variant information in the transform. Then, remove the step and re-apply the Date formatting through the Type drop-down for the column. The required type information is applied.
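For TD-19827 above, it can help to check the process limits currently in effect before changing them. The following is a minimal sketch that assumes a Linux host and is run as the account that launches the platform. It only reads the limits. The procedure for raising them is covered in Miscellaneous Configuration and is not reproduced here.

```python
import resource


def process_limits():
    """Return the (soft, hard) limits on processes for the current user."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


soft, hard = process_limits()
print(f"soft={describe(soft)} hard={describe(hard)}")
```

A soft limit well below the hard limit suggests that the soft limit can be raised without administrator changes to the hard limit.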

 

Release 3.2

This release features the introduction of the following key features:

  • A new and improved object model.
  • A completely redesigned execution engine (codename: Photon), which enables much better performance across larger samples in the Transformer page and faster execution on the Trifacta Server.

    NOTE: To interact with the Photon running environment, all desktop instances of Google Chrome must have the PNaCl component enabled and updated to the minimum supported version. See Desktop Requirements.

    NOTE: If you are upgrading from Release 3.1.x, you must manually enable the Photon running environment. If you are upgrading from an earlier version or installing Release 3.2 or later, the Photon running environment is enabled by default. See Configure Photon Running Environment.

  • The Transform Builder, a menu-driven interface for rapidly building transforms. 
  • A new publishing interface with easier, more flexible configuration.
  • Numerous other features and performance enhancements.

Details are below.

What's New

Object Model:

  • Redesigned object model and related changes to the Trifacta application enable greater flexibility in asset reuse in current and future releases. 

    NOTE: Beginning in Release 3.2, the Trifacta platform is transitioning to an enhanced object model, which is designed to support greater re-usability of objects and improved operationalization. This new object model and its related features will be introduced over multiple releases. For more information, see Changes to the Object Model.

Transformer Page:

  • A newly designed interface helps you to quickly build transform steps. See Transform Builder.
  • New publishing interface with more flexible configuration options for outputs. See Run Job Page.
  • Scrolling and loading improvements in the Transformer page.
  • Substantial increase in the size of samples in Transformer page for better visibility into source data and more detailed profiling.
  • Use the Dependencies Browser to review and resolve dependency errors between your datasets. See Recipe Navigator.
  • Explore automatically detected string patterns in column data using pattern profiling and build transforms based on these patterns. See Column Details Panel.

  • Join tool now supports fuzzy join options. See Join Panel.

Admin, Install, & Config:

NOTE: The minimum system requirements for the Trifacta node have changed for this release. For more information, see System Requirements.

 

Command Line Interface:

APIs:

  • Support for end-to-end integration via API and CLI. For more information on content, please contact Trifacta Support.

Job Execution and Performance:

  • Superior performance in job execution. Run jobs on the Trifacta Server on much larger datasets and at a faster rate.

  • Numerous performance improvements to the web application across many users.
  • New Batch Job Runner service simplifies job monitoring and improves performance.

    NOTE: The Batch Job Runner service requires a separate database for tracking jobs. New and existing customers must manually install this database. See Install the Databases.

      

  • Improved error message on job failure.

Connectivity:

Security:

  • Numerous security enhancements.

Changes to System Behavior

This section outlines changes to how the platform behaves that have resulted from features or bug fixes in Release 3.2.

Post-Upgrade Sampling

NOTE: Due to changes in system behavior, all existing random samples for a dataset are no longer available after upgrading to this release. For any upgraded dataset, the selected sample reverts to the default sample, the first N rows of the dataset. The number of rows in the sample depends on the number of columns, data density, and other factors.

When you load your dataset into the Transformer page for the first time:

  • The first N rows of the dataset are selected as a sample.

    NOTE: The first N rows sample may change the data that is displayed in the data grid. In some cases, the data grid may initially display no data at all.

  • A new random sample is automatically generated for you.

  • The Collect New Random Sample button is available. However, until you add a script step that changes the number of rows in the dataset, this button creates a random sample that is identical to the one that is automatically created for you when you first load the dataset into the Transformer page.

Changes to Wrangle

  • The multisplit transform has been replaced by a more flexible version of the split transform. For more information, see Split Transform.
  • Additional miscellaneous changes. See Changes to the Language.

Key Bug Fixes

Ticket | Description
TD-18319 | Inconsistent results for DATEDIFF function across running environments. For more information, see Changes to the Language.
TD-16255 | windowfill function fails to fill over some empty cells.
TD-16086 | Job list drop-down fails to enable selection of correct jobs.
TD-16084 | Job cards display CLI Job source for jobs launched from the application.
TD-15609 | Column filtering only works if filtering value is entered in lowercase.
TD-15442 | Attempt to publish to Cloudera Navigator for a Trifacta® Server job results in a DataNotFoundException.
TD-15330 | Pivot transform generates "Cannot read property 'primitive' of undefined" error.
TD-14541 | Names for private connections can collide with names of global connections, resulting in private connection unable to be edited by the owning user.
TD-14397 | Left or outer join against dataset with deduplicate as last script line fails in Pig execution.
TD-13162 | Join key selection screen and buttons are not accessible on a small desktop screen.

New Known Issues

Ticket | Component | Description
TD-19150 | Transformer Page

On Photon, unnest transform fails if pluck=true.

Workaround: The pluck parameter forces the removal of the unnested values from the source. You may be able to use the replace transform on the source column to remove these values.

TD-19032 | Transformer Page

Swapping rapidly between source datasets that have already been edited may cause a No samples found error.

Workaround: Log out and log in again. Perform your dataset swap as needed.

TD-18933 | Transformer Page

You cannot load a dataset that utilizes another dataset via join or union three levels deep.

Example: three datasets (Level1, Level2, Level3) each integrate ref_dataset via join. You union Level1 and Level2. Then, when you try to union those two into Level3, you get an error.

Workaround: You can generate results for the lower-level datasets and then create a new wrangled dataset from these results. However, you no longer automatically inherit changes from the source dataset(s).


TD-18836 | Transformer Page

find function accepts negative values for the start index. These values are consumed but produce unexpected results.

Workaround: Use non-negative values as inputs.

TD-18746 | Transformer Page

When Photon is enabled, previews in the data grid may take up to 30 seconds to dismiss.

Workaround: This issue is related to the display of suggestion cards. Although it's not an ideal solution, you can experiment with disabling the display of preview cards in the data grid options menu. See Data Grid Panel.

TD-18538 | Connectivity

Platform fails to start if Trifacta user for S3 access does not have the ListAllMyBuckets permission.

Workaround: Please verify that this user has the appropriate permissions.

TD-18288 | Installer/Upgrader

After upgrading from Release 3.1.2 or earlier, any datasource that has never been used to create a dataset is no longer available.

Workaround: The assets remain untouched on the datastore where located. As long as the user has read permissions to the datastore area, the assets can be re-imported into the platform for Release 3.2 and later.

TD-18268 | Transformer Page

If you profile a wide column (one that contains many characters of data in each cell value), the machine learning service can crash.

Workaround: Restart the machine learning service. If visual profiling of the column is important, look to split the column into separate columns and then profile each one individually.

TD-18093 | Transformer Page - Tools

Changes to a dataset that generates new columns can break any downstream lookups that use the dataset.

Workaround: If the lookup breaks, you can recreate it.

TD-17713 | Connectivity

Preview of Hive tables intermittently fails to show table data. When you click the Eye icon to preview Hive table data, you might see a spinner icon.

Workaround: Preview data on another Hive table. Then, preview the data on the first table again. If you do not have another table to preview, try previewing the Hive table three times, which might work.

TD-17677 | Administration/Configuration

Remove references to Zookeeper in the platform.

Workaround: As of Release 3.2, the Trifacta platform no longer requires access to Zookeeper. However, removal of all references in the platform requires more work, which will be completed in a future release.

TD-17657 | Transform Builder

splitrows transform allows splitting even if the required on parameter is set to an empty value.

Workaround: Make sure you specify a valid value for on.

TD-17333 | Transformer Page

Sorting on a Datetime column places 00:00:00 values at the bottom of the list when operating on the JavaScript running environment.

Workaround: This issue does not appear in the Photon running environment or in jobs executed in Photon or Hadoop Pig. See Configure Photon Running Environment.

TD-16419 | Transform Builder

Comparison functions added through Builder are changed to operators in recipe.

TD-15858 | Connectivity

Importing a directory of Avro files only imports the first file when the Photon running environment is enabled.

Workaround: You can try re-exporting from the source system in a different format or importing the data when the JavaScript-based running environment is enabled. For more information on how to re-enable, see Configure Photon Running Environment.

TD-14622 | Script Infrastructure

Python and Java UDFs accept inputs with zero parameters.

Workaround: Insert a dummy parameter as part of the input.

TD-14131 | Compilation/Execution

splitrows transform does not work after a backslash.

TD-12283 | Installer/Upgrader/Utilities

Platform cannot execute jobs on Pig that are sourced from S3, if OpenJDK is installed. 

Workaround: Install Oracle JDK 1.8 before installing the Trifacta platform. See System Requirements.
