Please complete the following steps in the listed order to configure your installed instance of the platform to integrate with an HDInsight (HDI) cluster.
Deploy HDI cluster and the platform
NOTE: The HDI cluster can be deployed as part of installation from the Marketplace. You can also integrate the platform with a pre-existing cluster. Details are below.
For more information, see Install from Azure Marketplace.
Create registered application
You must create an Azure Active Directory (AAD) application and grant it the desired access permissions, such as read/write access to the ADLS resource and read/write access to the Azure Key Vault secrets. The service principal for this application is used by the platform for access to all Azure resources. For more information, see https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal.
After you have registered, acquire the following information:
These properties are applied later in the configuration process.
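As a rough illustration of how a registered application's credentials are used, the following Python sketch builds (but does not send) an OAuth 2.0 client-credentials token request against the Azure AD v1 token endpoint. The function name and the parameter names `tenant_id`, `client_id`, and `client_secret` are assumptions for this example; they are not the property names used by the platform, and the default `resource` value assumes the token is intended for Key Vault.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id, client_id, client_secret,
                        resource="https://vault.azure.net"):
    """Build a client-credentials token request for a registered application.

    All argument names are hypothetical; substitute the values acquired
    from your own application registration.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": resource,
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # The request is only constructed here. Pass it to
    # urllib.request.urlopen() to actually contact Azure AD.
    return Request(url, data=body, headers=headers, method="POST")
```

If sending this request returns an access token, the registration details are at least internally consistent. It does not confirm that the application has been granted the ADLS or Key Vault permissions.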
Configure the Platform
Configure for HDI
If you are integrating the platform with a pre-existing HDI cluster, additional configuration is required. See Configure for HDInsight.
Configure base storage layer
For Azure installations, you can set your base storage layer to be HDFS or WASB.
Configure for Key Vault
For authentication purposes, the platform must be integrated with an Azure Key Vault keystore. For more information, see https://azure.microsoft.com/en-us/services/key-vault/.
Please complete the following sections to create and configure your Azure Key Vault.
Create a Key Vault resource in Azure
Enable Key Vault access for the platform
In the Azure portal, you must assign access policies that allow the platform's application principal to access the Key Vault.
Create WASB access token
If you are enabling access to WASB, you must create this token within the Azure Portal.
For more information, see https://docs.microsoft.com/en-us/rest/api/storageservices/delegating-access-with-a-shared-access-signature.
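Before storing a shared access signature (SAS) token, it can help to confirm that it carries the expected permissions and has not expired. The following Python sketch parses the token's query parameters. It assumes the token includes the `sv`, `sp`, `se`, and `sig` fields and that the expiry (`se`) uses the full `YYYY-MM-DDThh:mm:ssZ` form. Tokens tied to a stored access policy or using a shorter date form would need different handling. The function name is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def inspect_sas_token(token, now=None):
    """Return the permissions and expiry status of a SAS token string."""
    # parse_qs URL-decodes values; keep the first value for each field.
    fields = {k: v[0] for k, v in parse_qs(token.lstrip("?")).items()}
    missing = [k for k in ("sv", "sp", "se", "sig") if k not in fields]
    if missing:
        raise ValueError(f"SAS token is missing fields: {missing}")
    # Assumption: expiry is given to the second, in UTC.
    expiry = datetime.strptime(fields["se"], "%Y-%m-%dT%H:%M:%SZ")
    expiry = expiry.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return {
        "permissions": fields["sp"],
        "expires": expiry,
        "expired": expiry <= now,
    }
```

This check only reads the token locally. It does not verify the signature or confirm that the storage account accepts the token.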
Configure Key Vault key and secret for WASB
In the Key Vault, you can create key and secret pairs for use.
WASB: To enable access to the Key Vault, you must specify your key and secret values as follows:
Acquire shared access signature value:
In the Azure portal, please do the following:
Create a custom key:
To create a custom key and secret pair for WASB use by the platform, please complete the following steps:
Configure Key Vault location
For ADLS or WASB, the location of the Azure Key Vault must be specified for the platform. The location can be found in the properties section of the Key Vault resource in the Azure portal.
This value is the location of the Key Vault. It must be applied in the platform configuration.
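A Key Vault location in the Azure public cloud normally takes the form `https://<vault-name>.vault.azure.net/`. The following Python sketch normalizes a value copied from the portal and rejects anything that does not match that shape. It assumes the public-cloud DNS suffix; sovereign clouds use different suffixes. The function name is hypothetical.

```python
from urllib.parse import urlparse

VAULT_SUFFIX = ".vault.azure.net"  # assumption: Azure public cloud


def normalize_vault_url(value):
    """Return the vault location as https://<name>.vault.azure.net/."""
    parsed = urlparse(value.strip())
    host = parsed.hostname or ""  # hostname is lowercased by urlparse
    if (parsed.scheme != "https"
            or not host.endswith(VAULT_SUFFIX)
            or len(host) <= len(VAULT_SUFFIX)):
        raise ValueError(f"Not a Key Vault location: {value!r}")
    return f"https://{host}/"
```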
Apply SAS token identifier for WASB
If you are using WASB as your base storage layer, you must apply the SAS token value to the configuration of the platform.
Configure Secure Token Service
Access to the Key Vault requires use of the secure token service (STS) from the platform. To use STS with Azure, the following properties must be specified.
Configure for SSO
If needed, you can integrate the platform with Azure AD for single sign-on (SSO) to the platform. See Configure SSO for Azure AD.
Configure for ADLS
Enable read-only or read-write access to ADLS. For more information, see Enable ADLS Access.
Configure for WASB
Enable read-only or read-write access to WASB. For more information on integrating with WASB, see Enable WASB Access.
Configure relational connections
If you are integrating with relational datastores, please complete the following configuration sections.
Create encryption key file
An encryption key file must be created on the node where the platform is installed. This key file is shared across all relational connections. See Create Encryption Key File.
Create Hive connection
You can create a connection to the Hive instance on the HDI cluster with some modifications.
Natively, Azure supports high availability for HiveServer2 via Zookeeper. As a result, host and port information in the JDBC URL must be replaced with a Zookeeper quorum.
In addition to the other Hive connection properties, please specify the following values for the properties listed below:
Connections are created through the Connections page. See Connections Page.
For additional details on creating a connection to Hive, see Create Hive Connections.
A Hive connection can also be created using the above property substitutions via CLI or API.
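To illustrate the substitution described above, the following Python sketch assembles a Hive JDBC URL that uses a Zookeeper quorum in place of a single HiveServer2 host and port. The `serviceDiscoveryMode` and `zooKeeperNamespace` parameters are standard Hive JDBC options. The host names, the default port `2181`, and the default namespace `hiveserver2` are assumptions that should be checked against your cluster, and the function name is hypothetical. How these values map onto the platform's connection properties is not shown here.

```python
def build_hive_zk_jdbc_url(zk_hosts, zk_port=2181, database="",
                           namespace="hiveserver2"):
    """Build a HiveServer2 JDBC URL that discovers servers via Zookeeper."""
    if not zk_hosts:
        raise ValueError("At least one Zookeeper host is required")
    # The quorum is a comma-separated list of host:port pairs.
    quorum = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    return (
        f"jdbc:hive2://{quorum}/{database}"
        f";serviceDiscoveryMode=zooKeeper"
        f";zooKeeperNamespace={namespace}"
    )


# Example with placeholder host names:
print(build_hive_zk_jdbc_url(["zk0", "zk1", "zk2"]))
```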
Create Azure SQL DB connection
For more information, see Create SQL DB Connections.
Create Azure SQL DW connection
For more information, see Create SQL DW Connections.
Workaround for missing Python packages
After installation, the supervisord process may report that some Python packages are "missing."
These packages are present but lack the appropriate permissions. To enable the packages for use, please run the following on the node where the platform is installed: