This release contains numerous bug fixes and some interesting new features.
Admin, Install, & Config:
Support for CDH 5.9. See Supported Deployment Scenarios for Cloudera.
NOTE: Support for CDH 5.5/CDH 5.6 has been deprecated. Please upgrade to CDH 5.8 or later.
Changes to the Language:
output_path is now a required parameter for commands that use it.
NOTE: When specifying publishing options in the CLI, you may specify one file format only for the output.
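As an illustration of the now-required parameter, a job-run command that previously could omit the output path must supply it. The command name and other flags below are hypothetical placeholders; only `output_path` comes from the note above:

```
# Hypothetical invocation; only output_path is taken from this release note.
# Omitting --output_path now produces a missing-parameter error.
<cli> run_job --job_id 42 --output_path hdfs:///results/job42/
```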
TD-19404: Split transform using at parameter values out of range of the cell size generates an error in Pig.
TD-19150: On Photon,
TD-19032: Swapping rapidly between source datasets that have already been edited may cause a
TD-18933: You cannot load a dataset that utilizes another dataset via join or union three levels deep.
TD-18268: If you profile a wide column (one that contains many characters of data in each cell value), the machine learning service can crash.
TD-18093: Changes to a dataset that generate new columns can break any downstream lookups that use the dataset.
Publishing single-file CSV or JSON output to Redshift fails.
After upgrade, job card summaries on the Jobs page may fail to load for jobs that were executed in the pre-upgrade version and whose steps contain functions that have since been renamed.
When publishing to S3, you cannot write to a single file in an
When switching between an
You cannot configure a publishing location to be a directory that does not already exist.
Users are permitted to select compressed formats for
Job execution fails with
Column browser does not recognize when you place a checkmark next to the last column in the list.
Preview cards take a long time to load when selecting values from a Datetime column.
This release introduces the following key features:
A completely redesigned execution engine (codename: Photon), which enables much better performance across larger samples in the Transformer page and faster execution on the .
NOTE: To interact with the Photon running environment, all desktop instances of Google Chrome must have the PNaCl component enabled and updated to the minimum supported version. See Desktop Requirements.
NOTE: If you are upgrading from Release 3.1.x, you must manually enable the Photon running environment. If you are upgrading from an earlier version or installing Release 3.2 or later, the Photon running environment is enabled by default. See Configure Photon Running Environment.
Details are below.
Redesigned object model and related changes to the enable greater flexibility in asset reuse in current and future releases.
NOTE: Beginning in Release 3.2, the is transitioning to an enhanced object model, which is designed to support greater re-usability of objects and improved operationalization. This new object model and its related features will be introduced over multiple releases. For more information, see Changes to the Object Model.
Explore automatically detected string patterns in column data using pattern profiling and build transforms based on these patterns. See Column Details Panel.
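Pattern profiling of this kind can be sketched in ordinary Python. This is illustrative only; the function and token names are assumptions, not the product's API. The idea is to collapse each value in a column sample into a coarse character-class pattern and surface the most common one:

```python
import re
from collections import Counter

def infer_pattern(values):
    """Collapse each value into a coarse character-class pattern
    and return the most common pattern across the sample."""
    def to_pattern(v):
        # One pass, so the replacement tokens themselves are not re-matched.
        return re.sub(
            r"[0-9]+|[A-Za-z]+",
            lambda m: "{digit}" if m.group()[0].isdigit() else "{alpha}",
            v,
        )
    return Counter(to_pattern(v) for v in values).most_common(1)[0][0]

infer_pattern(["415-555-1234", "650-555-9876"])  # '{digit}-{digit}-{digit}'
```

A detected pattern like this can then serve as the basis for a transform that matches or extracts values conforming to it.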
Admin, Install, & Config:
NOTE: The minimum system requirements for the have changed for this release. For more information, see System Requirements.
Support for CDH 5.8 core and with security. See Supported Deployment Scenarios for Cloudera.
NOTE: Support for CDH 5.3/CDH 5.4 has been deprecated. Please upgrade to CDH 5.8 or later.
Command Line Interface:
Job Execution and Performance:
Superior performance in job execution. Run jobs on the on much larger datasets at a faster rate.
New Batch Job Runner service simplifies job monitoring and improves performance.
NOTE: The Batch Job Runner service requires a separate database for tracking jobs. New and existing customers must manually install this database. See Install the Databases.
This section outlines changes to how the platform behaves that have resulted from features or bug fixes in Release 3.2.
NOTE: Due to changes in system behavior, all existing random samples for a dataset are no longer available after upgrading to this release. For any upgraded dataset, the selected sample reverts to the default sample, the first N rows of the dataset. The number of rows in the sample depends on the number of columns, data density, and other factors.
When you load your dataset into the Transformer page for the first time:
The first N rows of the dataset are selected as the sample.
NOTE: The first N rows sample may change the data that is displayed in the data grid. In some cases, the data grid may initially display no data at all.
A new random sample is automatically generated for you.
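The two sampling steps above can be sketched as follows. This is plain Python for illustration; the function names are assumptions, not the platform's API:

```python
import random

def first_n_sample(rows, n):
    # Initial load: the sample is simply the first N rows of the dataset.
    return rows[:n]

def random_sample(rows, n, seed=None):
    # Generated afterward: a sample drawn at random across the whole dataset.
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))

dataset = list(range(1000))
first = first_n_sample(dataset, 5)  # [0, 1, 2, 3, 4]
rand = random_sample(dataset, 5)    # five rows from anywhere in the dataset
```

The first-N sample is cheap but biased toward the top of the file, which is why the data grid may look different (or empty) until the random sample is ready.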
The multisplit transform has been replaced by a more flexible version of the split transform. For more information, see Split Transform.
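For illustration, splitting a value at character offsets, including clamping offsets that fall outside the cell (the out-of-range case behind TD-19404), can be sketched in Python. This is not the platform's implementation, just a sketch of the technique:

```python
def split_at(value, positions):
    # Split a cell value at the given character offsets, clamping
    # out-of-range offsets to the cell length instead of erroring.
    parts, prev = [], 0
    for pos in sorted(min(max(p, 0), len(value)) for p in positions):
        parts.append(value[prev:pos])
        prev = pos
    parts.append(value[prev:])
    return parts

split_at("2016-11-08", [4, 7])  # ['2016', '-11', '-08']
split_at("abc", [10])           # ['abc', ''] -- offset clamped, no error
```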
TD-18319: Inconsistent results for
TD-16086: Job list drop-down fails to enable selection of correct jobs.
TD-16084: Job cards display
TD-15609: Column filtering only works if the filtering value is entered in lowercase.
Attempt to publish to Cloudera Navigator for a job results in a DataNotFoundException.
TD-15330: Pivot transform generates "Cannot read property 'primitive' of undefined" error.
TD-14541: Names for private connections can collide with names of global connections, leaving the owning user unable to edit the private connection.
TD-14397: Left or outer join against a dataset with
TD-13162: Join key selection screen and buttons are not accessible on a small desktop screen.
Swapping rapidly between source datasets that have already been edited may cause a
You cannot load a dataset that utilizes another dataset via join or union three levels deep.
Example: three datasets (
When Photon is enabled, previews in the data grid may take up to 30 seconds to dismiss.
Platform fails to start if for S3 access does not have the ListAllMyBuckets permission.
Any datasource created in Release 3.1.2 or earlier that has never been used to create a dataset is no longer available after upgrade.
If you profile a wide column (one that contains many characters of data in each cell value), the machine learning service can crash.
TD-18093 (Transformer Page - Tools): Changes to a dataset that generate new columns can break any downstream lookups that use the dataset.
Preview of Hive tables intermittently fails to show table data. When you click the Eye icon to preview Hive table data, you might see only a spinner icon instead of the data.
References to ZooKeeper have been removed from the platform.
TD-16419 (Transform Builder): Comparison functions added through the Builder are changed to operators in the recipe.
Importing a directory of Avro files only imports the first file when the Photon running environment is enabled.
Python and Java UDFs accept inputs with zero parameters.
Platform cannot execute jobs on Pig that are sourced from S3, if OpenJDK is installed.