Trifacta Dataprep





These release notes apply to the following product tiers of Dataprep by Trifacta®:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Starter Edition by Trifacta
  • Dataprep Premium by Trifacta
  • Dataprep Standard by Trifacta
  • Dataprep Legacy by Trifacta

Tip: You can see your product tier in the Trifacta application. Select Help menu > About Cloud Dataprep.

For more information, see Product Editions.

For release notes from previous releases, see Earlier Releases of Cloud Dataprep.

September 15, 2021

Release 8.7

What's New

Templates:

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta®
  • Dataprep Professional Edition by Trifacta
  • Dataprep Starter Edition by Trifacta
  • Dataprep Premium by Trifacta
  • Dataprep Standard by Trifacta


From the Flows page, you can now access pre-configured templates directly from the templates gallery.

Tip: Click Templates in the Flows page. Select the template, and the template is opened in Flow View for you.

Browsers:

  • Update to supported browsers:
    • Mozilla Firefox is generally supported.
    • Microsoft Edge is now supported.

      NOTE: This feature is in Beta release.
    • New versions of supported browsers are now supported.
    • For more information, see Browser Requirements.

Plans:

  • Create plan tasks to deliver messages to a specified Slack channel.

    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta®
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta

    For more information, see Create Slack Task.

Import data:

  • When you are importing from or writing to Cloud Storage, you can choose to display hidden files and folders so that you can access them.

    Tip: Use this option to access files generated for your job's visual profile and then publish them to BigQuery for additional analysis.

    For more information, see Import Data Page.

Sharing:


Publishing:

  • Strict type matching for publishing to BigQuery Datetime columns. 

    Tip: You can enable or disable strict type matching during publication to BigQuery. Strict type matching is enabled by default for new flows. You can disable the flag to revert to previous BigQuery publishing behaviors. See BigQuery Table Settings.

    For more information, see BigQuery Data Type Conversions.

Recipe panel:

Changes

None.

Deprecated

API:

  • The deprecated API endpoint for transferring assets between users has been removed from the platform. This endpoint was previously replaced by an improved method of transfer.
  • Some connection-related endpoints have been deprecated. These endpoints have little value for public use.
  • For more information, see Changes to the APIs.

Known Issues

  • TD-63517: Unpivoting a String column preserves null values in Dataflow but converts them to empty strings in Photon. As a result, the same job generates different results on the two running environments.

    Workaround: After the unpivot step, add an Edit with formula step. Set the columns to all of the columns in the unpivot and add the following formula, which converts all missing values to null values:

    if(ismissing($col),NULL(),$col)
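
The effect of the workaround formula can be sketched outside the product. The following Python is an illustration only and is not Wrangle syntax. It assumes that "missing" covers both null values and empty strings, and the helper name `missing_to_null` is hypothetical.

```python
from typing import Optional


def missing_to_null(value: Optional[str]) -> Optional[str]:
    """Return None for missing values (None or empty string), else the value."""
    # Assumption: "missing" covers both null and empty-string values.
    if value is None or value == "":
        return None
    return value


# Unpivoted String values as each running environment might emit them.
dataflow_values = ["a", None, "c"]  # nulls preserved
photon_values = ["a", "", "c"]      # nulls converted to empty strings

# After normalization, both environments yield the same column values.
normalized_dataflow = [missing_to_null(v) for v in dataflow_values]
normalized_photon = [missing_to_null(v) for v in photon_values]
```

Applying the same normalization to every unpivoted column is what makes the results comparable across running environments.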



Fixes

  • TD-63564: Schedules created by a flow collaborator with editor access stop working if the collaborator is removed from the flow.

    • Collaborators with viewer access cannot create schedules.


August 16, 2021

Release 8.6

What's New

Template Gallery:

Tip: You can start a trial account by selecting a pre-configured template from our templates gallery. See www.trifacta.com/templates.

Collaboration:

Connectivity:

  • Early Preview (read-only) connections available with this release:

    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta

Performance:

  • Conversion jobs are now processed asynchronously. 

  • Better management of file locking and concurrency during job execution. 

Better Handling of JSON files:

The Trifacta application now supports regularly formatted JSON files during import. You can now import flat JSON records contained in a single array object. With this, each array is treated as a single line and imported as a new row. For more information, see Working with JSON v2.
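
As a rough illustration of the input shape described above, the following Python sketch reads flat JSON records from a single array and lays them out as rows. It is not the product's import logic. The sample data is hypothetical, and the mapping of one record to one row is an assumption made for this example.

```python
import json

# Hypothetical file content: flat JSON records contained in a single array.
raw = '[{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]'

records = json.loads(raw)

# Collect column names in first-seen order so every row has the same shape.
columns = []
for record in records:
    for key in record:
        if key not in columns:
            columns.append(key)

# Assumption: one output row per record; absent keys become None.
rows = [[record.get(col) for col in columns] for record in records]
```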

Usage reporting:

Detailed reporting on vCPU and active users is now available in the Trifacta application.

NOTE: Active user reporting may not be available until September 1, 2021 or later.

For more information, see Usage Page.

Changes

Dataflow machines:

  • The following machine types are now available when running a Dataflow job:

    "e2-standard-2",
    "e2-standard-4",
    "e2-standard-8",
    "e2-standard-16",
    "e2-standard-32"

Deprecated

None.

Known Issues

  • TD-63564: Schedules created by a flow collaborator with editor access stop working if the collaborator is removed from the flow.

    • Tip: Flow owners can delete the schedule and create a new one. When this issue is fixed, the original schedule will continue to be executed under the flow owner's account.

    • Collaborators with viewer access cannot create schedules.

Fixes

  • TD-61478: Time-based data types are imported as String type from BigQuery sources when type inference is disabled.

July 20, 2021

Release 8.5

What's New

Tip: When you complete your Dataprep Enterprise Edition by Trifacta or Dataprep Professional Edition by Trifacta trial, you can choose to license a higher or lower tier product edition. For more information, see Product Editions.

Parameterization:

  • Create environment parameters to ensure that all users of the project or workspace use consistent references.

    NOTE: You must be a workspace administrator or project owner to create environment parameters.

    Tip: Environment parameters can be exported from one project or workspace and imported into another, so that these references are consistent across the enterprise.

  • Parameterize names of your storage buckets using environment parameters.

Schedules:

  • Project owners and workspace administrators can review, enable, disable, and delete schedules through the application.

    Feature Availability: This feature is not available in Dataprep Starter Edition by Trifacta.

    See Schedules Page.

Flow View:

Job execution:

  • Define SQL scripts to execute before data ingestion or after publication for file-based or table-based jobs.

Resource usage:

  • Review the total vCPU hours consumed by job execution within your project across an arbitrary time period.

Connectivity:

Contribute to the future direction of connectivity: Click I'm interested on a connection card to upvote adding the connection type to the Trifacta application. See Create Connection Window.

  • Early Preview (read-only) connections available with this release:

    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta

  • Apache Impala

Connectivity:

  • Connect to your relational database systems hosted on Cloud SQL. In the Connections page, click the Cloud SQL card for your connection type.
    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta®
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta


    For more information, see Create Connection Window.

API:

  • Cancel in-progress Dataflow jobs via API.

    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta
    • Dataprep Standard by Trifacta

    See Changes to the APIs.

Job execution:

You can choose to ignore recipe errors before job execution and then review any errors in the recipe through the Job Details page.

Language:

  • NUMVALUE function can be used to convert a String value formatted as a number into an Integer or Decimal value.
  • NUMFORMAT function now supports configurable grouping and decimal separators for localizing numeric values.
  • For more information, see Changes to the Language.
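
The kind of conversion these functions perform can be sketched in Python. This is an analogy only: the helper names `parse_localized_number` and `format_localized_number` and their parameters are hypothetical and do not reflect the actual signatures of NUMVALUE or NUMFORMAT, which are documented in Changes to the Language.

```python
from decimal import Decimal
from typing import Union

Number = Union[int, Decimal]


def parse_localized_number(text: str, grouping_sep: str = ",",
                           decimal_sep: str = ".") -> Number:
    """Convert a String formatted as a number into an int or Decimal."""
    # Drop grouping separators, then normalize the decimal separator to ".".
    cleaned = text.strip().replace(grouping_sep, "")
    cleaned = cleaned.replace(decimal_sep, ".")
    return Decimal(cleaned) if "." in cleaned else int(cleaned)


def format_localized_number(value: Number, grouping_sep: str = ",",
                            decimal_sep: str = ".") -> str:
    """Format a number with configurable grouping and decimal separators."""
    formatted = f"{value:,}"  # Python default: "," grouping, "." decimal
    # Swap separators through a placeholder so they do not clobber each other.
    return (formatted.replace(",", "\x00")
                     .replace(".", decimal_sep)
                     .replace("\x00", grouping_sep))
```

For example, `parse_localized_number("1.234.567,89", grouping_sep=".", decimal_sep=",")` yields a Decimal, and passing that value back through `format_localized_number` with the same separators reproduces the original string.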

Performance:

  • Improved performance when browsing folders containing a large number of files on Cloud Storage.

Resource usage:

  • Review the total vCPU hours consumed by your datasets, recipes, and job execution within your project across an arbitrary time period. 

Changes

None.

Deprecated

None.

Known Issues

None.

Fixes

  • TD-62190: You may not be able to view the SQL that was used to execute a job within BigQuery. This issue is due to a regression in the new BigQuery console in which job identifiers containing dashes are not supported. A ticket has been filed with Google.

June 7, 2021

Release 8.4

What's New

Template Gallery:

  • Check out the new gallery of flow templates, which can be imported into your workspace. These templates are pre-configured to solve the most compelling loading and transformation use cases in the product. For more information, see www.trifacta.com/templates.
    • For more information on importing flows into your workspace, see Import Flow.
    • For more information on using a template in the product, see Start with a Template.

Connectivity:

  • Early Preview (read-only) connections available with this release:

    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta

  • Splunk
  • YouTube Analytics

Collaboration:


Support for delete actions on merge (upsert) operations in BigQuery:

When publishing to a BigQuery table, you can choose to update or, with this release, to delete matching records during a merge operation. For more information, see BigQuery Table Settings.
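
The difference between updating and deleting matching records can be sketched with in-memory rows. This Python is a conceptual illustration, not how the product or BigQuery implements a merge. The function `merge_rows`, its `on_match` parameter, and the choice to insert unmatched source rows are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def merge_rows(target: List[Row], source: List[Row],
               key_fields: Tuple[str, ...], on_match: str = "update") -> List[Row]:
    """Merge source rows into target rows by key.

    on_match="update": a matching target row is replaced by the source row.
    on_match="delete": a matching target row is removed.
    Source rows with no match are inserted (an assumption of this sketch).
    """
    if on_match not in ("update", "delete"):
        raise ValueError("on_match must be 'update' or 'delete'")

    def key_of(row: Row) -> tuple:
        return tuple(row[field] for field in key_fields)

    # Index target rows by key; dicts preserve insertion order.
    merged = {key_of(row): row for row in target}
    for row in source:
        key = key_of(row)
        if key in merged and on_match == "delete":
            del merged[key]
        else:
            merged[key] = row
    return list(merged.values())


target = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
source = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]

updated = merge_rows(target, source, ("id",), on_match="update")
deleted = merge_rows(target, source, ("id",), on_match="delete")
```

With `on_match="update"`, the row with `id` 2 takes the source value. With `on_match="delete"`, that row is dropped from the result.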

Job execution:

You can choose to ignore recipe errors before job execution and then review any errors in the recipe through the Job Details page.

Language:

Changes

Trifacta Photon limits on execution time

Trifacta Photon is an in-memory running environment that is hosted on the same node as Dataprep by Trifacta, which allows for faster execution suitable for small- to medium-sized jobs.

Feature Availability: This feature is not available in Dataprep Legacy by Trifacta.

NOTE: Jobs that are executed on Trifacta Photon may be limited to run for a maximum of 10 minutes, after which they fail with a timeout error. If your job fails due to this limit, please switch to running the job on Dataflow.

Trifacta Photon can be enabled or disabled by a project administrator. For more information, see Dataprep Project Settings Page.

Execution of scheduled jobs on Trifacta Photon is not supported

In conjunction with the previous change, execution of scheduled jobs is not supported on Trifacta Photon. Since Trifacta Photon jobs are now limited to 10 minutes of execution time, scheduled jobs have been automatically migrated to execution on Dataflow to provide better execution success. For more information, see Trifacta Photon Running Environment.

Deprecated

None.

Known Issues

  • TD-62190: You may not be able to view the SQL that was used to execute a job within BigQuery. This issue is due to a regression in the new BigQuery console in which job identifiers containing dashes are not supported. A ticket has been filed with Google.

Fixes

  • TD-60881: Incorrect file path and missing file extension in the application for parameterized outputs.
  • TD-60382: Date format M/d/yy is handled differently by PARSEDATE function on Trifacta Photon and Spark.

May 20, 2021

Release 8.3 - push 3

What's New

Connectivity:

  • Support for SFTP connections.

    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta


    NOTE: This connection type is import only.

    For more information, see SFTP Connections.

Changes

Trifacta Photon enabled by default

Trifacta Photon is an in-memory running environment that is hosted on the same node as Dataprep by Trifacta, which allows for faster execution suitable for small- to medium-sized jobs.

Feature Availability: This feature is not available in Dataprep Legacy by Trifacta.

NOTE: Jobs executed in Trifacta Photon are executed within the Trifacta VPC. Data is temporarily streamed to the Trifacta VPC during job execution and is not persisted.

Beginning in this release, Trifacta Photon is enabled by default. Users can choose to run jobs on Trifacta Photon.

NOTE: For Dataprep Enterprise Edition by Trifacta, Trifacta Photon is enabled by default for new projects. For existing projects, a project administrator must still choose to enable it.

Trifacta Photon can be enabled or disabled by a project administrator. For more information, see Dataprep Project Settings Page.

Deprecated

None.

Known Issues

None.

Fixes

None.

May 10, 2021

Release 8.3

What's New

Running Environments:

Cancel Jobs in Dataflow:

You can cancel Dataflow jobs directly from the product.

NOTE: In some cases, the product is unable to cancel the job from the application. In these cases, click View in Dataflow Job. From there, you can cancel the job in progress.

Support for merge (upsert) operations in BigQuery:

When publishing to a BigQuery table, you can choose to write results using the merge option. When selected, you specify a primary key of fields and then decide how data is merged into the table. For more information, see BigQuery Table Settings.

Connectivity:

  • Early Preview (read-only) connections available with this release:

    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta

  • Authorize.net
  • Cockroach DB
  • DB2
  • Google Data Catalog
  • Google Spanner
  • Magento
  • Redis
  • Shopify
  • Smartsheet
  • Trello
  • QuickBase

Job execution:

Introducing new filter pushdowns to optimize the performance of your flows during job execution. For more information, see Flow Optimization Settings Dialog.

Job results:

You can now preview job results and download them from the Overview tab of the Job details page. For more information, see Job Details Page.

Tip: You can also preview job results in Flow View. See View for Outputs.

Changes

Improved method of JSON import

Beginning in this release, the Trifacta application now uses the conversion service to ingest JSON files during import. This improved method of ingestion can save significant time wrangling JSON into records.

NOTE: The new method of JSON import is enabled by default but can be disabled as needed.

For more information, see Working with JSON v2.

Flows that use imported datasets created using the old method continue to work without modification.

NOTE: Support for the v1 version of JSON import is likely to be deprecated in a future release. You should switch to using the new version as soon as possible. For more information on migrating your flows and datasets to use the new version, see Working with JSON v1.

Future work on support for JSON is targeted for the v2 version only.

Optionally, you can re-enable the old version, which is useful for migrating to the new version.

Feature Availability: This feature is not available in Dataprep Legacy by Trifacta.

For more information on using the old version and migrating to the new version, see Working with JSON v1.

Deprecated

None.

Known Issues

  • TD-61478: Time-based data types are imported as String type from BigQuery sources when type inference is disabled.

Fixes

  • TD-60701: Most non-ASCII characters incorrectly represented in visual profile downloaded in PDF format.
  • TD-59854: Datetime column from Parquet file incorrectly inferred to the wrong data type on import.

April 26, 2021

Release 8.2 push2

What's New

Upgrade: Trial customers can upgrade through the Admin console. See Admin Console.

This is the initial release for the following product tiers:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Starter Edition by Trifacta

Changes

None.

Deprecated

None.

Known Issues

None.

Fixes

None.

April 14, 2021

Release 8.2

This is the initial release for the following product tiers:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Starter Edition by Trifacta

What's New

Photon:

Introducing Trifacta Photon, an in-memory running environment for running jobs. Embedded in Dataprep by Trifacta, Trifacta Photon delivers improved performance in job execution and is best suited for small- to medium-sized jobs.

Feature Availability: This feature is not available in Dataprep Legacy by Trifacta.

NOTE: Trifacta Photon must be enabled by a project owner. For more information, see Dataprep Project Settings Page.

  • When you run a job, you can now choose to run it on Trifacta Photon.
  • For more information, see Run Job Page.

Quick scan sampling:

  • Trifacta Photon also enables quick scan sampling. A quick scan sample generates an appropriate selection of rows from the dataset from which the sample was initiated. These samples are faster to generate. For more information, see Overview of Sampling.
  • For more information on generating samples, see Samples Panel.

Preferences:

  • Re-organized user account, preferences, and storage settings to streamline the setup process. See Preferences Page.

Connectivity:

  • Early Preview (read-only) connections available with this release:

    Feature Availability: This feature is available in the following editions:

    • Dataprep Enterprise Edition by Trifacta
    • Dataprep Professional Edition by Trifacta
    • Dataprep Premium by Trifacta

Plan metadata references:

Feature Availability: This feature is available in the following editions:

  • Dataprep Enterprise Edition by Trifacta
  • Dataprep Professional Edition by Trifacta
  • Dataprep Premium by Trifacta

Use metadata values from other tasks and from the plan itself in your HTTP task definitions.


Improved accessibility of job results:

The Jobs tabs have been enhanced to display the list of the latest and previous jobs that have been executed for the selected output.

For more information, see View for Outputs.

Sample Jobs Page:

You can monitor the status of all sample jobs that you have generated. Project administrators can access all sample jobs in the workspace. For more information, see Sample Jobs Page.

Simplified output and destination experience:

From the Home Page, you can quickly redesign your output and destination experience. The step-by-step procedures enable you to create outputs through an improved and streamlined experience. For more information, see Start with a Template.

Changes

Improved methods for disabling the product:

Project owners can choose to disable Dataprep by Trifacta from within the product. For more information, see Enable or Disable Dataprep.

After the product has been disabled in a project, Trifacta data is placed in a hidden state for later purging. For more information on purging or restoring data, see Wipe Out Dataprep Data.

API:

The following API endpoints are scheduled for deprecation in a future release:

NOTE: Please avoid using the following endpoints.

/v4/connections/vendors
/v4/connections/credentialTypes
/v4/connections/:id/publish/info
/v4/connections/:id/import/info

These endpoints have little value for public use.

Deprecated

None.

Known Issues

  • TD-60701: Most non-ASCII characters incorrectly represented in visual profile downloaded in PDF format.

Fixes

  • TD-59236: Use of percent sign (%) in file names causes Transformer page to crash during preview.
  • TD-59218: BOM characters at the beginning of a file cause multiple headers to appear in Transformer Page.


Earlier Releases

For release notes from previous releases, see Earlier Releases of Cloud Dataprep.
