Comment: Published by Scroll Versions from space DEV and version r0682


Please complete the following configuration steps in the platform.

Info

NOTE: Integration with ADLS Gen2 is supported only on Azure Databricks.


Steps:

  1. Open the platform configuration for editing.
  2. Enable ADLS Gen2 as the base storage layer:

    Code Block
    "webapp.storageProtocol": "abfss",
    "hdfs.enabled": false,
    "hdfs.protocolOverride": "",
    Parameter | Description
    webapp.storageProtocol

    Sets the base storage layer for the platform. Set this value to abfss.

    Info

    NOTE: After this parameter has been saved, you cannot modify it. You must re-install the platform to change it.

    hdfs.enabled

    For ADLS Gen2 access, set this value to false.

    hdfs.protocolOverride

    For ADLS Gen2 access, this special parameter should be empty. It is ignored when the storage protocol is set to abfss.
  3. Configure ADLS Gen2 access mode. The following parameter must be set to system.

    Code Block
    "azure.adlsgen2.mode": "system",
  4. The platform must be configured to use the CDH 6.2 bundle JARs:

    Info

    NOTE: If you have integrated with Databricks Tables, do not overwrite the value for data-service.hiveJdbcJar with the following value, even if it's set to a different distribution JAR file.

    Code Block
    "hadoopBundleJar": "hadoop-deps/cdh-6.2/build/libs/cdh-6.2-bundle.jar",
    "spark-job-service.hiveDependenciesLocation": "%(topOfTree)s/hadoop-deps/cdh-6.2/build/libs",
    "data-service.hiveJdbcJar": "hadoop-deps/cdh-6.2/build/libs/cdh-6.2-hive-jdbc.jar",
  5. Set the protocol whitelist and base URIs for ADLS Gen2:

    Code Block
    "fileStorage.whitelist": ["abfss"],
    "fileStorage.defaultBaseUris": ["abfss://filesystem@storageaccount.dfs.core.windows.net/"],
    Parameter | Description
    fileStorage.whitelist

    A comma-separated list of protocols that are permitted to read from and write to ADLS Gen2 storage.

    Info

    NOTE: The protocol identifier "abfss" must be included in this list.

    fileStorage.defaultBaseUris

    For each supported protocol, this parameter must contain a top-level path to the location where platform files can be stored. These files include uploads, samples, and temporary storage used during job execution.

    Info

    NOTE: A separate base URI is required for each supported protocol. You may only have one base URI for each protocol.

  6. Save your changes.
  7. The Java VFS service must be enabled for ADLS Gen2 access. For more information, see Configure Java VFS Service in the Configuration Guide.
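
The settings from the steps above can be sanity-checked before saving. The following is a minimal sketch in Python, not part of the platform: the function name check_adls_gen2_settings and the idea of holding the settings in a flat dictionary keyed by parameter name are assumptions made for illustration. It checks only the rules stated in this section: the storage protocol, the HDFS flag, the access mode, the presence of abfss in the whitelist, and exactly one base URI per whitelisted protocol.

```python
from urllib.parse import urlsplit

# Expected values taken from the steps in this section.
REQUIRED_VALUES = {
    "webapp.storageProtocol": "abfss",
    "hdfs.enabled": False,
    "azure.adlsgen2.mode": "system",
}


def check_adls_gen2_settings(settings):
    """Return a list of problems found; an empty list means no problems.

    `settings` is a hypothetical flat dict of parameter name -> value.
    It is an illustration only, not a platform API.
    """
    problems = []

    # Fixed-value parameters: compare both value and type, so that
    # 0 or "false" is not accepted in place of the boolean false.
    for key, expected in REQUIRED_VALUES.items():
        actual = settings.get(key)
        if actual != expected or type(actual) is not type(expected):
            problems.append(f"{key} should be {expected!r}, found {actual!r}")

    # The whitelist must include the abfss protocol identifier.
    whitelist = settings.get("fileStorage.whitelist", [])
    if "abfss" not in whitelist:
        problems.append('fileStorage.whitelist must include "abfss"')

    # Group the base URIs by protocol (the URI scheme).
    uris_by_protocol = {}
    for uri in settings.get("fileStorage.defaultBaseUris", []):
        scheme = urlsplit(uri).scheme
        uris_by_protocol.setdefault(scheme, []).append(uri)

    # Each whitelisted protocol needs exactly one base URI.
    for protocol in whitelist:
        count = len(uris_by_protocol.get(protocol, []))
        if count != 1:
            problems.append(
                f"expected exactly one base URI for {protocol!r}, found {count}"
            )

    return problems


# Usage with the example values shown in this section.
example = {
    "webapp.storageProtocol": "abfss",
    "hdfs.enabled": False,
    "hdfs.protocolOverride": "",
    "azure.adlsgen2.mode": "system",
    "fileStorage.whitelist": ["abfss"],
    "fileStorage.defaultBaseUris": [
        "abfss://filesystem@storageaccount.dfs.core.windows.net/"
    ],
}
print(check_adls_gen2_settings(example))
```

The sketch does not check hdfs.protocolOverride, because that parameter is ignored when the storage protocol is abfss, and it does not verify that the JAR paths or the storage account exist.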

...