These release notes apply to the following product tiers of Cloud Dataprep:

Tip: You can see your product tier in the application. Select Help menu > About Cloud Dataprep.

For more information, see Product Editions.

For release notes from previous releases, see Earlier Releases of Cloud Dataprep.

May 10, 2021

Release 8.3

What's New

Running Environments:

Cancel Jobs in Dataflow:

You can cancel Dataflow jobs directly from the product.

NOTE: In some cases, the product is unable to cancel the job from the application. In these cases, click View in Dataflow Job, where you can cancel the job in progress.

Support for merge (upsert) operations in BigQuery:

When publishing to a BigQuery table, you can choose to write results using the merge option. When selected, you specify a primary key of fields and then decide how data is merged into the table. For more information, see BigQuery Table Settings.
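
As a rough illustration of what a merge (upsert) does, the following Python sketch updates rows whose key already exists and inserts rows whose key is new. It does not use BigQuery or the product. The function name `upsert`, the sample fields, and the "update on match" behavior are assumptions made for illustration only; the actual merge options are described in BigQuery Table Settings.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def upsert(table: List[Row], results: Iterable[Row], key_fields: Tuple[str, ...]) -> List[Row]:
    """Return a new table: rows with a matching key are updated, others are inserted.

    Hypothetical helper for illustration; it is not part of any product API.
    """
    # Index the existing rows by their key (one or more fields).
    merged: Dict[Tuple[object, ...], Row] = {
        tuple(row[f] for f in key_fields): dict(row) for row in table
    }
    for row in results:
        key = tuple(row[f] for f in key_fields)
        if key in merged:
            merged[key].update(row)   # key already present: update the row
        else:
            merged[key] = dict(row)   # new key: insert the row
    return list(merged.values())


# Made-up sample data: "id" plays the role of the primary key.
existing = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Lima"},
]
job_results = [
    {"id": 2, "city": "Quito"},   # matches id 2, so the row is updated
    {"id": 3, "city": "Cairo"},   # no match, so the row is inserted
]
merged_table = upsert(existing, job_results, key_fields=("id",))
```

In this sketch the original `existing` list is left unchanged, which makes it easy to compare the table before and after the merge.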

Connectivity:



Job execution:

Introducing new filter pushdowns to optimize the performance of your flows during job execution. For more information, see Flow Optimization Settings Dialog.

Job results:

You can now preview job results and download them from the Overview tab of the Job details page. For more information, see Job Details Page.

Tip: You can also preview job results in Flow View. See View for Outputs.

Changes

Improved method of JSON import

Beginning in this release, the application uses the conversion service to ingest JSON files during import. This improved method of ingestion can save significant time wrangling JSON into records.

NOTE: The new method of JSON import is enabled by default but can be disabled as needed.

For more information, see Working with JSON v2.

Flows that use imported datasets created using the old method continue to work without modification.

NOTE: Support for the v1 version of JSON import is likely to be deprecated in a future release. You should switch to using the new version as soon as possible. For more information on migrating your flows and datasets to use the new version, see Working with JSON v1.

Future work on support for JSON is targeted for the v2 version only.

Optionally, you can re-enable the old version, which is useful for migrating to the new version.

For more information on using the old version and migrating to the new version, see Working with JSON v1.


Deprecated

None.

Known Issues

Fixes

April 14, 2021

Release 8.2

What's New

Photon:

Introducing Photon, an in-memory running environment for running jobs. Embedded in the application, Photon delivers improved performance in job execution and is best suited for small- to medium-sized jobs.

This is the initial release of Photon for the following product tiers:


NOTE: Photon must be enabled by a project owner. For more information, see Dataprep Project Settings Page.

Quick scan sampling:

Preferences:


Connectivity:


Plan metadata references:

Use metadata values from other tasks and from the plan itself in your HTTP task definitions.


Improved accessibility of job results:


The Jobs tabs have been enhanced to display the latest and previous jobs that have been executed for the selected output.

For more information, see View for Outputs.

Sample Jobs Page:

You can monitor the status of all sample jobs that you have generated. Project administrators can access all sample jobs in the workspace. For more information, see Sample Jobs Page.

Simplified output and destination experience:

From the Home Page, you can quickly redesign your output and destination experience. The step-by-step procedures enable you to create outputs in a more streamlined way. For more information, see Start with a Template.


Changes

Improved methods for disabling the product:

Project owners can choose to disable Cloud Dataprep from within the product. For more information, see Enable or Disable Dataprep.

After the product has been disabled in a project, its data is placed in a hidden state for later purging. For more information on purging or restoring data, see Wipe Out Dataprep Data.

API:

The following API endpoints are scheduled for deprecation in a future release:

NOTE: Please avoid using the following endpoints.


/v4/connections/vendors
/v4/connections/credentialTypes
/v4/connections/:id/publish/info
/v4/connections/:id/import/info

These endpoints have little value for public use.
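
If you maintain scripts that call the API, a small check like the following Python sketch may help locate calls to the endpoints listed above. It is only an illustration: the helper name `is_deprecated` is hypothetical, and the sketch assumes that `:id` stands for a single path segment and that paths are compared without a host prefix or query string.

```python
import re

# Endpoint templates copied from the list above.
DEPRECATED_ENDPOINTS = [
    "/v4/connections/vendors",
    "/v4/connections/credentialTypes",
    "/v4/connections/:id/publish/info",
    "/v4/connections/:id/import/info",
]


def _to_regex(template: str) -> "re.Pattern[str]":
    """Turn a template such as /v4/connections/:id/publish/info into a regex.

    Assumption: a segment starting with ':' matches exactly one path segment.
    """
    parts = [
        r"[^/]+" if part.startswith(":") else re.escape(part)
        for part in template.split("/")
    ]
    return re.compile("/".join(parts) + r"/?$")


_PATTERNS = [_to_regex(t) for t in DEPRECATED_ENDPOINTS]


def is_deprecated(path: str) -> bool:
    """Return True if the request path matches one of the listed endpoints."""
    path = path.split("?", 1)[0]  # ignore any query string
    return any(pattern.match(path) for pattern in _PATTERNS)


# The connection id 12 below is a made-up value.
checks = {
    "/v4/connections/vendors": is_deprecated("/v4/connections/vendors"),
    "/v4/connections/12/publish/info": is_deprecated("/v4/connections/12/publish/info"),
    "/v4/connections/12/import/info?limit=5": is_deprecated("/v4/connections/12/import/info?limit=5"),
}
```

Running a check like this over the request paths in your own scripts gives a quick list of calls to review before the endpoints are removed.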


Deprecated

None.

Known Issues

Fixes


March 16, 2021

Release 8.1

What's New

Connectivity:

Specify column headers during import:

You can specify the column headers for your dataset during import. For more information, see Import Data Page.

Sample Jobs Page:

You can monitor the status of all sample jobs that you have generated. Project administrators can access all sample jobs in the workspace. For more information, see Sample Jobs Page.

Job results:

Results of data quality checks are now part of the visual profile PDF available with your job results. In the PDF, you can download the data quality results over the entire dataset.


Sharing:

API:

Macro updates:

You can replace an existing macro definition with a macro that you have exported to your local desktop.

NOTE: Before you replace the existing macro, you must export a macro to your local desktop. For more information, see Export Macro.

For more information, see Macros Page.

Changes

Freed IP address ranges:

The following IP address range is the only one in use by the product:

34.68.114.64/28


Please discontinue whitelisting any other IP address ranges for the product.

These ranges have been freed to the general Internet.
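
To review an existing allowlist against the range above, a check along these lines may be useful. This is a minimal sketch using Python's standard `ipaddress` module; the helper names `is_in_allowed_range` and `stale_entries` are hypothetical, and the sample entries other than 34.68.114.64/28 are placeholder values (192.0.2.0/24 is a range reserved for documentation).

```python
import ipaddress
from typing import Iterable, List

# The only range in use, as stated above.
ALLOWED_RANGE = ipaddress.ip_network("34.68.114.64/28")


def is_in_allowed_range(address: str) -> bool:
    """Return True if a single IP address falls inside the allowed range."""
    return ipaddress.ip_address(address) in ALLOWED_RANGE


def stale_entries(allowlist: Iterable[str]) -> List[str]:
    """Return allowlist entries (addresses or CIDR blocks) outside the allowed range."""
    stale = []
    for entry in allowlist:
        network = ipaddress.ip_network(entry, strict=False)
        # subnet_of() requires both networks to use the same IP version.
        same_version = network.version == ALLOWED_RANGE.version
        if not same_version or not network.subnet_of(ALLOWED_RANGE):
            stale.append(entry)
    return stale


# Placeholder allowlist for illustration.
current_allowlist = ["34.68.114.64/28", "34.68.114.72/29", "192.0.2.0/24"]
to_remove = stale_entries(current_allowlist)
```

Entries returned by `stale_entries` are candidates for removal; confirm each one against your own firewall configuration before deleting it.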

Changes to Preferences:

The Preferences area of the application has been changed. For more information, see Changes to Configuration.

Deprecated

None.

Known Issues

Fixes

February 16, 2021

Release 8.0

Features

Tip: Add a profile picture to your account! For more information, see User Profile Page.

Flow templates:

Introducing flow templates, which are predefined flows with guidelines for creating the flow objects needed to solve a specific transformation and publication use case. These step-by-step guides leverage placeholders for flow objects to assist you in rapidly assembling your end-to-end flow pipeline.

The first available template covers the Data Warehouse Onboarding process: ingesting datasets, transforming them, and loading them into your data warehouse. From the Home page, you can quickly set up a pipeline from data lakes into data warehouses.

Authorization:

APIs:

Connectivity:

Import:

Recipe development:

Metric-based data quality rules:

Update Macros:

Job execution:

Changes

None.

Deprecated

None.

Known Issues


Fixes

None.


January 12, 2021

Release 7.10

Features

In-app chat: Have a question about the product? Use the new in-app chat feature to explore content or ask a question to our support staff. If you need assistance, please reach out!

In-app tours: Check out the new in-app tours, which walk you through the steps of wrangling your datasets into clean, actionable data. 

Import: The maximum permitted size of a file uploaded through the application has been increased from 100 MB to 1 GB.

Import and Export Plans: You can import and export plans from one environment, workspace, or project to another.

Share Plans: Share plans with one or more users to work together on the same plan.

Email notifications: Send email notifications to plan owners and collaborators based on the execution status of plans.

Connectivity: Improved Salesforce connection type.

Changes

IP address range whitelist:

Changes to IAM roles for service accounts: Recently, Google announced changes to permissions required for attaching IAM roles to service accounts. If you are using IAM roles for your Google service accounts, these changes may require updating the permissions that you must enable in your IAM roles. For more information, see Changes to User Management.

Enable listing all users in the workspace: Beginning in this release, workspace administrators can choose to enable or disable the listing of all workspace users in the application. For example, if you are sharing a flow with another user and this feature is enabled, you can browse the list of all workspace users and select users with whom to share.

Deprecated

None.

Known Issues

None.

Fixes

TD-53527: When importing a dataset via API that is sourced from a BZIP file stored on S3, the columns may not be properly split when the platform is permitted to detect the structure.


December 14, 2020

Release 7.9

Features

In-app chat: Have a question about the product? Use the new in-app chat feature to explore content or ask a question to our support staff. If you need assistance, please reach out!

Plan View: Execute branching and parallel tasks, using success/failure criteria to determine next steps.

Transform Builder: An All option has been added for selecting columns in the Transform Builder. For more information, see Changes to the Language.

API access tokens: Individual project users can now be permitted to create and use their own access tokens for use with the REST APIs. For more information, see Dataprep Project Settings Page.

Manage data access: You can control user access to datastores such as Cloud Storage and BigQuery based on finer-grained permissions assigned to the user's IAM role.

Changes

Changes to permissions: The set of required and optional permissions has changed for the product.

Job results page changes: The old flow view in the dependency graph tab is replaced with the new flow view.

Optimizer service re-enabled: In the September release, the Optimizer service was introduced, but an issue was discovered, which caused us to disable the service temporarily. This issue has been fixed.

Deprecated

None.

Known Issues

None.

Fixes

TD-53475: Missing associated artifact error when importing a flow.


November 17, 2020

Release 7.8

Features

Plans:

Flow View:

Data quality rules:

Language:

API: Modify the source bucket and path for a defined imported dataset.

For more information, see API Workflow - Swap Datasets.

Changes

JDBC connection pooling disabled: The ability to create connection pools for JDBC-based connections has been disabled. It is likely to be removed in a future release.

Deprecated Parameter History Panel Feature: As part of the collaborative suggestions enhancement, support for the Parameter History panel is deprecated. For more information on the collaborative suggestions feature, see Overview of Predictive Transformation.

Automatic random samples are disabled: In prior releases, a random sample of the data was automatically generated for display when a recipe with source data greater than 10 MB was first loaded into the Transformer page. The Initial Sample, which is the first set of rows in the dataset, was displayed by default, and this automatic random sample was available for manual selection if needed.

NOTE: Recent issues with long-running random sample jobs require that the generation of automatic random samples be disabled until the issues are addressed.

You can still generate random samples manually. If sample generation takes too long, you can cancel it and select a different sampling type. For more information, see Samples Panel.

Classic Flow View no longer available: In Release 7.6, an improved version of Flow View was released. At the time of release, users could switch back to using the classic version. 

Beginning in this release, the classic version of Flow View is no longer available. 

Tip: The objects in your flows that were created in classic Flow View may be misaligned in the new version of Flow View. You can use auto-arrange to re-align your flow objects.

For more information, see Flow View Page.

Enhanced Flow and Flow View menu options: The context menu options for Flow View and Flow have been renamed and reorganized for a better user experience.

Salesforce connector disabled temporarily: In Release 7.8, the Salesforce connector has been disabled temporarily. In a future release, it will be replaced with an improved version of the Salesforce connector.

Deprecated

None.

Known Issues

TD-55503: When you swap datasets via API, existing samples are not discarded. These samples are invalid.

Fixes

TD-53318: Cannot publish results to relational targets when flow name or output filename or table name contains a hyphen (e.g. my - filename.csv).


October 2, 2020

Release 7.6 push 3

Features

None.

Changes

Disabled Optimizer Service: In the September release, the Optimizer service was introduced, which enabled users to apply advanced physical and logical optimizations for flow and job executions. Recently, an issue was discovered, which has caused us to disable the service temporarily.

Deprecated

None.

Known Issues

None.

Fixes

None.


October 1, 2020

Release 7.6 push 3

Features

None.

Changes

Shared VPCs: Across all product editions, you can now run jobs through another project by specifying a full URL for the shared VPC.

Deprecated

None.

Known Issues

None.

Fixes

None.


September 21, 2020

Release 7.6

Features

Plans:

In-app messaging: Be sure to check out the new in-app messaging feature, which allows us to share new features and relevant content with users. More developments coming soon!

Project Settings:

MySQL:

Optimizer Service: The optimizer service optimizes query execution against data sources to minimize use of resources, reduce compute costs, and improve overall job execution time.

Relational long-loading: For relational datasources that take time to load, you can continue to work while monitoring the loading process through the Import Data page, Flow View, or Dataset Details. For more information, see Overview of Job Monitoring.

Documentation:

Changes

None.

Deprecated

None.

Known Issues

None.

Fixes

TD-52559: When publishing a single CSV file with headers using the append or overwrite publishing action, multiple instances of the header may be written in the output file.

TD-48915: Inserting special characters in an output filename results in a validation error in the application and job failures.


August 4, 2020

Release 7.5

Features

New Flow View is now available:

Data quality rules: Introducing data quality rules, which enable you to define data quality checks specific to your dataset. For more information, see Overview of Data Quality.

Flow Sharing API

New functions:

Changes

Delete Table IAM permission no longer required

Add new IP address to whitelist

Deprecated

None.

Known Issues

TD-50942: If a flow is unshared with you, you cannot see or access the datasources for any jobs that you have already run on the flow. You can still access the job results.

Fixes

TD-49559: Cannot select and apply custom data types through column Type menu.

TD-47473: Uploaded files (CSV, XLS) that contain a space in the filename fail to be converted.

TD-34840: Platform fails to provide suggestions for transformations when selecting keys from an object with many of them.

Earlier Releases

For release notes from previous releases, see Earlier Releases of Cloud Dataprep.