The Trifacta platform can be configured to integrate with supported versions of Azure Databricks clusters to run jobs in Spark. 

NOTE: Before you attempt to integrate, you should review the limitations around this integration. For more information, see Configure for Azure Databricks.

Steps:

  1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  2. Configure the following parameters to enable job execution on the specified Azure Databricks cluster:

    "webapp.runInDatabricks": true,
    "webapp.runWithSparkSubmit": false,

    Parameter                    Description
    webapp.runInDatabricks       Defines if the platform runs jobs in Azure Databricks. Set this value to true.
    webapp.runWithSparkSubmit    For all Azure Databricks deployments, this value should be set to false.

  3. Configure the following Azure Databricks-specific parameters:

    "databricks.serviceUrl": "<url_to_databricks_service>",

    Parameter                Description
    databricks.serviceUrl    URL to the Azure Databricks Service where Spark jobs will be run. Example: https://westus2.azuredatabricks.net

    NOTE: If you are using instance pooling on the cluster, additional configuration is required. See Configure for Azure Databricks.

  4. Save your changes and restart the platform.
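
If you edit trifacta-conf.json with a script instead of the Admin Settings Page, the parameter changes in steps 2 and 3 can be sketched as below. This is only an illustrative sketch: it assumes that the dotted parameter names map to nested JSON objects (for example, "webapp.runInDatabricks" becomes the key runInDatabricks inside the webapp object), and the helper names set_dotted and apply_databricks_settings are hypothetical, not part of the platform. The service URL is the example value from this page; substitute your own.

```python
import json
from urllib.parse import urlparse

# Values from steps 2 and 3. The URL is the example from this page.
DATABRICKS_SETTINGS = {
    "webapp.runInDatabricks": True,
    "webapp.runWithSparkSubmit": False,
    "databricks.serviceUrl": "https://westus2.azuredatabricks.net",
}


def set_dotted(config, dotted_key, value):
    """Set a value by dotted path, creating parent objects as needed.

    Assumption: dotted names correspond to nested JSON objects.
    """
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value


def apply_databricks_settings(config, settings=DATABRICKS_SETTINGS):
    """Apply the settings to an already-loaded configuration dict."""
    # Catch an unreplaced placeholder such as <url_to_databricks_service>.
    url = urlparse(settings["databricks.serviceUrl"])
    if url.scheme not in ("http", "https") or not url.netloc:
        raise ValueError("databricks.serviceUrl must be a full URL")
    for key, value in settings.items():
        set_dotted(config, key, value)
    return config


if __name__ == "__main__":
    # Hypothetical existing configuration, for demonstration only.
    existing = {"webapp": {"runWithSparkSubmit": True}}
    print(json.dumps(apply_databricks_settings(existing), indent=2))
```

The sketch works on an in-memory dictionary only. Reading and writing the actual file, and restarting the platform afterward as described in step 4, are left to your own tooling.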

