
Release 6.0.2

This release contains several bug fixes.

What's New

Changes to System Behavior

Info

NOTE: As of Release 6.0, all new and existing customers must license, download, and install the latest version of the Tableau SDK onto the Trifacta node. For more information, see Create Tableau Server Connections.

Upload:

  • In previous releases, files uploaded to the Trifacta platform with an unsupported filename extension triggered a warning before upload.
  • Beginning in this release, files with unsupported extensions are blocked from upload.
  • You can change the list of supported file extensions. For more information, see Miscellaneous Configuration.
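Because uploads with unsupported extensions are now blocked rather than warned about, it is worth reviewing the allowlist before upgrading. As a purely hypothetical sketch of what such a setting could look like (the actual property name, location, and accepted values are documented in Miscellaneous Configuration, not here):

```
"webapp.upload.allowedFileExtensions": ["csv", "tsv", "txt", "json", "xlsx"]
```

Any extension absent from the configured list would be rejected at upload time instead of producing a warning.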

Documentation:

  • The Release 6.0.x documentation mistakenly included documentation for the API JobGroups Get Status v4 endpoint. This endpoint does not exist. For more information on the v4 equivalent, see Changes to the APIs.

Key Bug Fixes

Ticket | Description
TD-40471 | SAML auth: Logout functionality not working
TD-39318 | Spark job fails with parameterized datasets sourced from Parquet files
TD-39213 | Publishing to Hive table fails

New Known Issues

None.

Release 6.0.1

This release features support for several new Hadoop distributions and numerous bug fixes.

What's New

Connectivity:

  • Support for integration with CDH 5.16.

  • Support for integration with CDH 6.1. Version-specific configuration is required.

    Info

    NOTE: If you have upgraded to Cloudera 6.0.0 or later and are using EC2 role-based authentication to access AWS resources, you must change two platform configuration properties. For more information, see Configure for EC2 Role-Based Authentication.

    See Supported Deployment Scenarios for Cloudera.

  • Support for integration with HDP 3.1. Version-specific configuration is required. See Supported Deployment Scenarios for Hortonworks.
    • Support for Hive 3.0 on HDP 3.0 or HDP 3.1. Version-specific configuration is required. See Configure for Hive.
  • Support for Spark 2.4.0.

    Info

    NOTE: Not all running environment distributions support Spark 2.4.0.

    For more information, see Configure for Spark.

  • Support for integration with high availability for Hive.

    Info

    NOTE: High availability for Hive is supported on HDP 2.6 and HDP 3.0 with Hive 2.x enabled. Other configurations are not currently supported.


    For more information, see Create Hive Connections.


Changes to System Behavior

Photon

In the application and documentation, the following changes have been applied.

Reference | Description | Old Run Job page term | New Run Job page term | Doc
Hadoop | Supported running environment on the Hadoop cluster | Run on Hadoop | Spark | Configure for Spark
Photon running environment | Supported running environment on the Trifacta node | Trifacta Server | Photon | Configure Photon Running Environment
Photon in-browser client | In-browser web client | n/a | n/a | Configure Photon Client

Key Bug Fixes

Ticket | Description
TD-39779 | MySQL JARs must be downloaded by the user. NOTE: If you are installing the databases in MySQL, you must download a set of JARs and install them on the Trifacta node. For more information, see Install Databases for MySQL.
TD-39694 | Tricheck returns status code 200, but there is no response. It does not work through the Admin Settings page.
TD-39455 | HDI 3.6 is not compatible with Guava 26.
TD-39086 | Hive ingest job fails on Microsoft Azure.

New Known Issues

Ticket | Description
TD-40299 | Cloudera Navigator integration cannot locate the database name for JDBC sources on Hive.
TD-40348 | When loading a recipe in an imported flow that references an imported Excel dataset, the Transformer page displays an "Input validation failed: (Cannot read property 'filter' of undefined)" error, and the screen is blank. Workaround: In Flow View, select an output object and run a job. Then load the recipe in the Transformer page and generate a new sample. For more information, see Import Flow.
TD-39969 | On import, some Parquet files cannot be previewed and result in a blank screen in the Transformer page. Workaround: Parquet format supports row groups, which define the size of the data chunks that can be ingested. If the row group size is greater than 10 MB in a Parquet source, preview and initial sampling do not work. To work around this issue, import the dataset and create a recipe for it, then generate a new sample in the Transformer page. For more information, see Parquet Data Type Conversions.

Release 6.0

This release introduces key features around column management, including multi-select and copy and paste of columns and column values. A new Job Details page captures more detailed information about job execution and enables more detailed monitoring of in-progress jobs. Some relational connections now support publishing to connected databases. This is our largest release yet. Enjoy!

Info

NOTE: This release also announces the deprecation of several features, versions, and supported extensions. Please be sure to review Changes to System Behavior below.

What's New

Info

NOTE: The PNaCl client for Google Chrome has been replaced by the WebAssembly client. This new client is now the default in use by the platform and is deployed to all clients through the browser. Please verify that all users in your environment are on Google Chrome 68+. For more information, see Desktop Requirements.


Info

NOTE: Beginning in this release, the desktop application requires a 64-bit version of Microsoft Windows. For more information, see Install Desktop Application.


Wrangling:

  • In data grid, you can select multiple columns before receiving suggestions and performing transformations on them. For more information, see Data Grid Panel.
    • New Selection Details panel enables selection of values and groups of values within a selected column. See Selection Details Panel.
  • Copy and paste columns and column values through the column menus. See Copy and Paste Columns.
  • Support for importing files in Parquet format. See Supported File Formats.
  • Specify ranges of key values in your joins. See Configure Range Join.

Jobs:

  • Review details and monitor the status of in-progress jobs through the new Job Details page. See Job Details Page.
  • Filter list of jobs by source of job execution or by date range. See Jobs Page.

Connectivity:

  • Hive integration is now available when the backend datastore is S3. See Configure for Hive.

Language:

  • Track file-based lineage using $filepath and $sourcerownumber references. See Source Metadata References.

  • In addition to directly imported files, the $sourcerownumber reference now works for converted files (such as Microsoft Excel workbooks) and for datasets with parameters. See Source Metadata References.

Workspace:

  • Organize your flows into folders. See Flows Page.

Publishing:

  • Users can be permitted to append to Hive tables when they do not have CREATE or DROP permissions on the schema. 

    Info

    NOTE: This feature must be enabled. See Configure for Hive.

Administration:

  • New Workspace Settings page centralizes many of the most common admin settings. See Changes to System Behavior below.

  • Download system logs through the web application. See Admin Settings Page.


Authentication:

  • Integrate SSO authentication with enterprise LDAP-AD using platform-native LDAP support. (Beta)

    Info

    NOTE: In previous releases, LDAP-AD SSO utilized an Apache reverse proxy. While this method is still supported, it is likely to be deprecated in a future release. Please migrate to the platform-native SSO method. See Configure SSO for AD-LDAP.

  • Support for SAML SSO authentication. See Configure SSO for SAML.

  • Support for per-user authentication for AWS resources. See Configure for AWS.
  • Support for Azure Databricks SSO/OAuth. 

    Info

    NOTE: If you integrate the platform with an Azure Databricks cluster and enable SSO for Azure, Azure Databricks is managed through SSO seamlessly. For more information, see Configure SSO for Azure AD.

API:

  • Manage user access to APIs using renewable access tokens. For more information, see Changes to the APIs.
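Once a renewable access token has been obtained, it is typically presented on each API call. The sketch below, in Python, shows one common convention for doing so; the URL is a placeholder and the Bearer-token header shape is an assumption based on standard token-auth practice, so consult Changes to the APIs for the platform's actual endpoints and headers.

```python
import urllib.request

def build_authed_request(url: str, token: str) -> urllib.request.Request:
    # Attach the access token as a Bearer credential on the request.
    # Hypothetical example only; the real endpoint paths and header
    # requirements are defined by the v4 API documentation.
    req = urllib.request.Request(url)
    req.add_header("Authorization", "Bearer " + token)
    return req

req = build_authed_request("https://example.com/v4/jobGroups", "my-access-token")
print(req.get_header("Authorization"))  # → Bearer my-access-token
```

Because tokens are renewable, client code should also be prepared to request a fresh token and retry when a call is rejected as unauthorized.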

Changes to System Behavior

Info

NOTE: The Trifacta node requires NodeJS 10.13.0. See System Requirements.

Configuration:

To simplify configuration of the most common feature enablement settings, some settings have been migrated to the new Workspace Settings page. For more information, see Workspace Settings Page.

Info

NOTE: Over subsequent releases, more settings will be migrated to the Workspace Settings page from the Admin Settings page and from platform configuration files. For more information, see Changes to Configuration.

See Platform Configuration Methods.

See Admin Settings Page.

API:

Info

NOTE: In the next release, the v3 version of the APIs will be removed from the product. These End of Life endpoints will no longer be available for interaction with the platform. You must migrate your usage to the v4 APIs. For more information, see Changes to the APIs.

CLI:

Info

NOTE: The command line interface (CLI) uses the v3 endpoints. In the next release, the CLI will reach its End of Life, and these tools will no longer be provided with the software distribution. You must migrate your use of the CLI to the v4 APIs.

Java 7:

Info

NOTE: In the next release, support for Java 7 will reach its End of Life. The product will no longer be able to use Java 7. Please upgrade to Java 8 on the Trifacta node and on your Hadoop cluster.

Changes to release numbering system:

In Release 5.0 and earlier, each release of the product was given a separate release number, with each release incrementing that number. For example, the Release 4.x product line was numbered Release 4.0, Release 4.1, and Release 4.2.

In Release 5.1, we moved to a monthly milestone release process. Monthly milestones were given separate release numbers in the following format: Release 5.1m1, Release 5.1m2, Release 5.1m3, and Release 5.1m4. The fifth milestone was the generally available release for Release 5.1.

Beginning in this release, each monthly milestone receives a separate release number. For this release, the milestones were Release 5.6, Release 5.7, and Release 5.8. Release 5.9 is the generally available release.

This change in numbering scheme does not affect the scope and frequency of product releases.

Errata:

In prior releases, the product and documentation stated that the platform implemented a version of regular expressions based on JavaScript syntax. This is incorrect.

The platform implements a version of regular expressions based on RE2 and PCRE regular expressions.

Info

NOTE: This is not a change in behavior. Only the documentation has been changed.
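The distinction between engine families matters mainly for advanced constructs. The Python illustration below uses the standard re module, which is PCRE-flavored; RE2, by design, omits constructs such as lookaheads and backreferences to guarantee linear-time matching, so patterns relying on them may be rejected or behave differently depending on the engine.

```python
import re

# Lookahead: matches 'foo' only when immediately followed by 'bar'.
# Supported by PCRE-style engines; not supported by RE2.
lookahead = re.compile(r"foo(?=bar)")
print(bool(lookahead.search("foobar")))   # True
print(bool(lookahead.search("foobaz")))   # False

# Backreference: matches a repeated word. Also absent from RE2.
backref = re.compile(r"(\w+) \1")
print(bool(backref.search("hello hello")))  # True
```

When writing patterns intended to run on the platform, prefer constructs common to both RE2 and PCRE and verify behavior in the product rather than in a browser console.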

Key Bug Fixes

Ticket | Description
TD-36332 | Data grid can display wrong results if a sample is collected and the dataset is unioned.
TD-36192 | Canceling a step in the recipe panel can result in column menus disappearing in the data grid.
TD-35916 | Cannot log out via SSO.
TD-35899 | A deployment user can see all deployments in the instance.
TD-35780 | Upgrade: Duplicate metadata in separate publications causes DB migration failure.
TD-35644 | Extractpatterns with "HTTP Query strings" option doesn't work.
TD-35504 | Cancel job throws a 405 status code error. Clicking Yes repeatedly pops up the Cancel Job dialog.
TD-35486 | Spark jobs fail on LCM function that uses negative numbers as inputs.
TD-35483 | Differences in how the WEEKNUM function is calculated in the Photon and Spark running environments, due to the underlying frameworks on which the environments are built. NOTE: Photon and Spark jobs now behave consistently. Week 1 of the year is the week that contains January 1. For more information, see Changes to the Language.
TD-35481 | Upgrade script is malformed due to SplitRows not having a Load parent transform.
TD-35177 | Login screen pops up repeatedly when access permission is denied for a connection.
TD-27933 | For multi-file imports lacking a newline in the final record of a file, the final record may be merged with the first record of the next file and then dropped in the Photon running environment.

New Known Issues

Ticket | Description
TD-39513 | Import of a folder of Excel files as a parameterized dataset only imports the first file, and sampling may fail. Workaround: Import the files as separate datasets and union them together.
TD-39455 | HDI 3.6 is not compatible with Guava 26. Workaround: HDI 3.6 supports Guava 14. The solution is to remove the Guava 26 file from the Data Service class path. For more information, see Troubleshooting in Configure for HDInsight.
TD-39092 | $filepath and $sourcerownumber references are not supported for Parquet file inputs. Workaround: Upload your Parquet files. Create an empty recipe and run a job to generate an output in a different file format, such as CSV or JSON. Use that output as a new dataset. See Build Sequence of Datasets. For more information on these references, see Source Metadata References.
TD-39086 | Hive ingest job fails on Microsoft Azure.
TD-39053 | Cannot read datasets from Parquet files generated by Spark containing nested values. Workaround: In the source for the job, change the data types of the affected columns to String and re-run the job on Spark.
TD-39052 | Signout using the reverse proxy method of SSO is not working after upgrade.
TD-38869 | Upload of Parquet files does not support nested values, which appear as null values in the Transformer page. Workaround: Unnest the values before importing into the platform.
TD-37683 | Send a copy does not create independent sets of recipes and datasets in the new flow. If imported datasets are removed in the source flow, they disappear from the sent version. Workaround: Create new versions of the imported datasets in the sent flow.
TD-36145 | The Spark running environment recognizes numeric values preceded by + as Integer or Decimal data type. The Photon running environment does not and types these values as strings.
TD-35867 | v3 publishing API fails when publishing to alternate S3 buckets. Workaround: You can use the corresponding v4 API to perform these publication tasks. For more information on a workflow, see API Workflow - Manage Outputs.