This integration is currently blocked by the following third-party known issue: Cloudera Navigator OPSAPS-39589.
This functionality was last confirmed to behave correctly in the following deployment only:
- Trifacta platform Release 4.0 and earlier
- Cloudera 5.8.x
- Cloudera Navigator 2.7.1
For more information, please contact Cloudera Support.
The Trifacta® platform can optionally publish metadata about recipes and jobs to Cloudera Navigator, which provides data governance over the Cloudera cluster. This section describes how to enable and configure this integration.
- Cloudera Navigator is an integrated data management solution for the Cloudera platform, providing security, governance, discovery, and analysis across diverse datasets in the cluster. For more information, see https://www.cloudera.com/resources/datasheet/cloudera-navigator-datasheet.html.
When this integration is enabled, recipe and job information can be published for all jobs executed through Pig.
NOTE: This method of publishing works only for jobs executed on Pig. It does not work for jobs executed on the Spark running environment.
When this feature is enabled, the following behaviors are applied to publishing:
- When a job completes on the Hadoop Pig running environment, the Trifacta platform automatically attempts to publish the link to the corresponding Trifacta job to Navigator.
- If the attempt is successful, there is no need to execute any additional publishing to Navigator.
- If publishing fails, or if you want to publish a Trifacta job that predates enabling this feature, you can publish to Navigator manually. See Export Results Window.
NOTE: The integration does not support publishing of Spark jobs to Navigator. This is a known issue.
- The Trifacta platform must be installed, configured, and integrated with an existing instance of the Cloudera platform. Please see the Cloudera Navigator documentation for additional details.
- Only Cloudera 5.8.x is supported.
- Cloudera Navigator 2.7.1 or later is supported.
- The Trifacta node must have the Cloudera Manager port opened. The default port is
- You must have a Navigator user account with write permissions into the appropriate Navigator project.
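The port requirement above can be checked from the Trifacta node before you enable the integration. The following is a minimal sketch, not part of the product: the function name is hypothetical, and the host and port are placeholders that you must replace with the values for your Cloudera Manager instance.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (placeholder values, supply your own host and port):
#   port_is_reachable("cloudera-manager.example.com", 12345)
```

A result of False only indicates that the TCP connection failed. It does not distinguish between a closed port, a firewall rule, and an incorrect hostname.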
To enable SSL use:
NOTE: CDH 5.8 is required to use SSL with Cloudera Navigator.
- A Java keystore and a sample CA certificate must be created on the node hosting Cloudera Manager.
- A valid, self-signed certificate must be created on the node hosting Cloudera Manager.
- In the order listed, the above certificates must be imported into the Java keystore.
- Retain the server path and the passwords for the keystore and certificates.
- For more information, see the documentation that was provided for your Cloudera Manager release.
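Because the certificates must be imported in the order listed, it can help to generate the import commands programmatically. The following sketch only builds the command lines for the standard JDK keytool utility and does not execute them. The function name, aliases, and file paths are hypothetical, and your Cloudera Manager release may call for different keytool options.

```python
from typing import List, Sequence, Tuple


def build_import_commands(
    keystore_path: str,
    store_password: str,
    certificates: Sequence[Tuple[str, str]],
) -> List[List[str]]:
    """Build one `keytool -importcert` command per (alias, cert_file) pair.

    The order of `certificates` is preserved, so list the CA certificate first
    and the self-signed certificate second.
    """
    commands = []
    for alias, cert_file in certificates:
        commands.append([
            "keytool", "-importcert", "-noprompt",
            "-alias", alias,
            "-file", cert_file,
            "-keystore", keystore_path,
            "-storepass", store_password,
        ])
    return commands


# Hypothetical aliases and paths, for illustration only:
#   build_import_commands("/path/to/keystore.jks", "changeit",
#                         [("rootca", "ca.crt"), ("navigator", "server.crt")])
```

Passing the keystore password as a command-line argument exposes it to other users on the node. Treat this as an illustration of the ordering requirement rather than a hardened procedure.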
Enable Navigator Publish
Please complete the following steps to enable publication to Cloudera Navigator.
- You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Edit the following properties:
Base URL of the Navigator instance where you are publishing.
NOTE: The port number must be specified as part of the baseURL. Default value is
Namespace in Navigator where metadata is published.
When set to true, publication to Navigator is enabled.
Username of the Navigator account to use to connect.
Password of the Navigator account.
- If you are using HTTPS to connect to Cloudera Navigator, additional configuration is required. Otherwise, set
- Save the file.
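Since the port number must be part of the base URL, a quick check before saving can catch a common mistake. The following is a minimal sketch using only the Python standard library. The function name and the example hostnames and port are hypothetical.

```python
from urllib.parse import urlsplit


def base_url_has_port(base_url: str) -> bool:
    """Return True if base_url is an http(s) URL with a host and an explicit port."""
    try:
        parts = urlsplit(base_url)
        # Accessing .port raises ValueError when the port is not a valid number.
        return (
            parts.scheme in ("http", "https")
            and parts.hostname is not None
            and parts.port is not None
        )
    except ValueError:
        return False


# Example calls (placeholder host and port):
#   base_url_has_port("http://navigator.example.com:12345")  # explicit port
#   base_url_has_port("http://navigator.example.com")        # port missing
```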
Additional Configuration for SSL
To enable communication over SSL with Cloudera Navigator, please complete the following steps in Cloudera Manager and on the Trifacta node.
NOTE: Before you begin, you must create valid certificates and import them into the Java keystore in the node hosting Cloudera Manager.
- Launch Cloudera Manager.
- Select MGMT.
- Click Configuration.
- Click Scope > Navigator Metadata Server Category > Security.
- Set Enable TLS/SSL for Navigator Metadata Server to
- Set TLS/SSL for Navigator Metadata Server to the path where the Java keystore was created. The following is the default path:
- Set TLS/SSL Keystore File Password to the password for the Java keystore.
- Set TLS/SSL Keystore Key Password to the password for the certificate.
- Restart the MGMT service.
- The JKS file that you created must be transferred to an accessible location on the Trifacta node.
- On the Trifacta node, edit
Edit the following properties:
Set this value to
Specify the path on the Trifacta node to the JKS file.
To enable SSL access, set this value to
- Change the baseURL value to use HTTPS.
- Save the file and restart the platform. See Start and Stop the Platform.
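Changing the base URL to HTTPS amounts to replacing the URL scheme while keeping the host, port, and path. The following sketch shows one way to do that with the Python standard library. The function name and example URL are hypothetical, and the sketch assumes the port stays the same after TLS is enabled, which may not hold for your deployment.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(base_url: str) -> str:
    """Return base_url with its scheme set to https; all other parts are kept."""
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"expected an http or https URL, got: {base_url!r}")
    # SplitResult is a named tuple, so _replace returns a modified copy.
    return urlunsplit(parts._replace(scheme="https"))


# Example call (placeholder host and port):
#   to_https("http://navigator.example.com:12345")
```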
- If you haven't done so already, restart the platform to apply the configuration changes. See Start and Stop the Platform.
- Open a dataset in the Transformer Page.
- Click Run Job.
- Select Run with Hadoop.
- When the job completes, click Export Results in the Job Results window.
- In the Export Results window, click Publish under Publish to Cloudera Navigator. See Export Results Window.
- A success message is displayed.
- Click the displayed links to verify the results inside Cloudera Navigator.
- You may need to provide a username and password.
- There may be a short delay in the results appearing in Cloudera Navigator.
Error - Requested data was not found: Pig job info for job '20' not found
When you attempt to publish through the Export Results window and receive this error, you have used a running environment other than Pig. Please re-execute the job using Hadoop Pig.
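If you collect publishing errors in a script or log review, the error above can be recognized by its message text. The following is a minimal sketch that extracts the job identifier from the message. The function name is hypothetical, and the pattern is based only on the single example message shown in this section, so other variants of the message may not match.

```python
import re
from typing import Optional

# Pattern derived from the example error message in this section.
_PIG_INFO_MISSING = re.compile(r"Pig job info for job '(\d+)' not found")


def non_pig_job_id(error_message: str) -> Optional[str]:
    """Return the job ID if the message reports missing Pig job info, else None."""
    match = _PIG_INFO_MISSING.search(error_message)
    return match.group(1) if match else None
```

A returned job ID identifies a job that should be re-executed on Hadoop Pig before publishing again.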