Please complete the following configuration steps in the Trifacta® platform.

NOTE: Integration with ADLS Gen2 is supported only on Azure Databricks.


Steps:

  1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
  2. Enable ADLS Gen2 as the base storage layer:

    "webapp.storageProtocol": "abfss",
    "hdfs.enabled": false,
    "hdfs.protocolOverride": "",
    Parameter                Description
    webapp.storageProtocol   Sets the base storage layer for the platform. Set this value to abfss.
                             NOTE: After this parameter has been saved, you cannot modify it. You must re-install the platform to change it.
    hdfs.enabled             For ADLS Gen2 access, set this value to false.
    hdfs.protocolOverride    For ADLS Gen2 access, this special parameter should be empty. It is ignored when the storage protocol is set to abfss.
  3. Configure ADLS Gen2 access mode. The following parameter must be set to system.

    "azure.adlsgen2.mode": "system",
  4. The platform must be configured to use the CDH 6.2 bundle JARs:

    NOTE: If you have integrated with Databricks Tables, do not overwrite the value for data-service.hiveJdbcJar with the following value, even if it's set to a different distribution JAR file.

    "hadoopBundleJar": "hadoop-deps/cdh-6.2/build/libs/cdh-6.2-bundle.jar",
    "spark-job-service.hiveDependenciesLocation": "%(topOfTree)s/hadoop-deps/cdh-6.2/build/libs",
    "data-service.hiveJdbcJar": "hadoop-deps/cdh-6.2/build/libs/cdh-6.2-hive-jdbc.jar",
  5. Set the protocol whitelist and base URIs for ADLS Gen2:

    "fileStorage.whitelist": ["abfss"],
    "fileStorage.defaultBaseUris": ["abfss://filesystem@storageaccount.dfs.core.windows.net/"],
    Parameter                     Description
    fileStorage.whitelist         A comma-separated list of protocols that are permitted to read from and write to ADLS Gen2 storage.
                                  NOTE: The protocol identifier "abfss" must be included in this list.
    fileStorage.defaultBaseUris   For each supported protocol, this parameter must contain a top-level path to the location where Trifacta platform files can be stored. These files include uploads, samples, and temporary storage used during job execution.
                                  NOTE: A separate base URI is required for each supported protocol. You may have only one base URI for each protocol.

  6. Save your changes.
  7. The Java VFS service must be enabled for ADLS Gen2 access. For more information, see Configure Java VFS Service in the Configuration Guide.
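
The values required in steps 2, 3, and 5 can be checked with a short script before you save them. The following is a minimal sketch, not a Trifacta tool: the helper name check_adls_gen2_settings is hypothetical, and it assumes the settings are supplied as a flat dictionary of dotted keys exactly as they are written in the steps above. The actual layout of trifacta-conf.json may differ.

```python
from urllib.parse import urlsplit

# Hypothetical helper for illustration; it is not part of the Trifacta platform.
# Assumption: settings is a flat dict keyed by the dotted names used in the steps above.
def check_adls_gen2_settings(settings):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if settings.get("webapp.storageProtocol") != "abfss":
        problems.append("webapp.storageProtocol must be abfss")
    if settings.get("hdfs.enabled") is not False:
        problems.append("hdfs.enabled must be false")
    if settings.get("azure.adlsgen2.mode") != "system":
        problems.append("azure.adlsgen2.mode must be system")

    whitelist = settings.get("fileStorage.whitelist", [])
    if "abfss" not in whitelist:
        problems.append("fileStorage.whitelist must include abfss")

    # Each whitelisted protocol needs exactly one base URI.
    base_uris = settings.get("fileStorage.defaultBaseUris", [])
    schemes = [urlsplit(uri).scheme for uri in base_uris]
    for protocol in whitelist:
        if schemes.count(protocol) != 1:
            problems.append(f"expected exactly one base URI for {protocol}")
    return problems


example = {
    "webapp.storageProtocol": "abfss",
    "hdfs.enabled": False,
    "hdfs.protocolOverride": "",
    "azure.adlsgen2.mode": "system",
    "fileStorage.whitelist": ["abfss"],
    "fileStorage.defaultBaseUris": [
        "abfss://filesystem@storageaccount.dfs.core.windows.net/"
    ],
}
print(check_adls_gen2_settings(example))
```

The check reports problems rather than raising on the first one, so a single run lists every setting that still needs attention.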

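The base URI shown for fileStorage.defaultBaseUris uses placeholder names. In the abfss URI scheme used by the Hadoop ABFS driver, the text before the @ names the file system (container), and the first label of the host names the storage account. The following sketch illustrates that breakdown with Python's standard library only. The function name split_abfss_uri is hypothetical and is not part of the platform.

```python
from urllib.parse import urlsplit

# Hypothetical helper for illustration only.
def split_abfss_uri(uri):
    """Split an abfss URI into (filesystem, storage_account, path)."""
    parts = urlsplit(uri)
    if parts.scheme != "abfss":
        raise ValueError(f"not an abfss URI: {uri}")
    if not parts.username or not parts.hostname:
        raise ValueError(f"expected abfss://filesystem@account-host/path, got: {uri}")
    # Text before '@' is the file system; the first host label is the account name.
    return parts.username, parts.hostname.split(".")[0], parts.path


print(split_abfss_uri("abfss://filesystem@storageaccount.dfs.core.windows.net/"))
```

Replace filesystem and storageaccount with the names from your own Azure deployment when you set the base URI.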
