...
Azure Storage Layer | Description | Required for |
---|---|---|
Azure Storage | Azure storage leverages WASB, an abstraction layer on top of HDFS. | NOTE: WASBS is recommended. |
Data Lake Store | Data Lake Store maps to ADLS in the | hdfs |
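For orientation, a WASB location is normally addressed with a URI of the form `wasb[s]://<container>@<account>.blob.core.windows.net/<path>`, where the `wasbs` scheme selects the TLS-encrypted variant. The sketch below only assembles such a URI. The helper name `wasb_uri` and the container, account, and path values are hypothetical placeholders, not values from this guide.

```python
def wasb_uri(container, account, path="", secure=True):
    """Build a WASB/WASBS URI for a blob container (illustrative helper)."""
    # "wasbs" is the encrypted scheme; "wasb" is the unencrypted one.
    scheme = "wasbs" if secure else "wasb"
    # Strip leading slashes so the path joins cleanly after the host.
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# Placeholder names; substitute your own container and storage account.
print(wasb_uri("mycontainer", "myaccount", "/data/input.csv"))
# prints: wasbs://mycontainer@myaccount.blob.core.windows.net/data/input.csv
```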
...
Configure Batch Job Runner:
Code Block "batch-job-runner": { "autoRestart": true, ... "classpath": "%(topOfTree)s/hadoop-data/build/install/hadoop-data/hadoop-data.jar:%(topOfTree)s/hadoop-data/build/install/hadoop-data/lib/*:%(topOfTree)s/conf/hadoop-site:/usr/hdp/current/hadoop-client/hadoop-azure.jar:/usr/hdp/current/hadoop-client/lib/azure-storage-2.2.0.jar" },
Configure Diagnostic Server:
Code Block "diagnostic-server": { "autoRestart": true, ... "classpath": "%(topOfTree)s/apps/diagnostic-server/build/libs/diagnostic-server.jar:%(topOfTree)s/apps/diagnostic-server/build/dependencies/*:%(topOfTree)s/conf/hadoop-site:/usr/hdp/current/hadoop-client/hadoop-azure.jar:/usr/hdp/current/hadoop-client/lib/azure-storage-2.2.0.jar" },
Configure the following environment variables:
Code Block "env.PATH": "${HOME}/bin:$PATH:/usr/local/bin:/usr/lib/zookeeper/bin", "env.TRIFACTA_CONF": "/opt/trifacta/conf", "env.JAVA_HOME": "/usr/lib/jvm/java-1.8.0-openjdk-amd64",
Configure the following properties for various components:
Code Block "ml-service": { "autoRestart": true }, "monitor": { "autoRestart": true, ... "port": <your_cluster_monitor_port> }, "proxy": { "autoRestart": true }, "udf-service": { "autoRestart": true }, "webapp": { "autoRestart": true },
Disable S3 access:
Code Block "aws.s3.enabled": false,
Configure the following Spark Job Service properties:
Code Block "spark-job-service.classpath": "%(topOfTree)s/services/spark-job-server/server/build/libs/spark-job-server-bundle.jar:%(topOfTree)s/conf/hadoop-site/:%(topOfTree)s/services/spark-job-server/build/bundle/*:/usr/hdp/current/hadoop-client/hadoop-azure.jar:/usr/hdp/current/hadoop-client/lib/azure-storage-2.2.0.jar", "spark-job-service.env.SPARK_DIST_CLASSPATH": "/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-mapreduce-client/*",
- Save your changes.
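The settings above can be sanity-checked before restarting services. The following is a minimal sketch, not part of the product: it assumes the configuration has been loaded into a Python dictionary, and the helper names (`lookup`, `missing_azure_jars`, `check_config`) are hypothetical. It verifies only that each classpath from the steps above names the two Azure jars and that S3 access is disabled.

```python
import json

# Jar paths copied from the classpath settings in this section; they may
# differ on other HDP/HDI versions.
REQUIRED_AZURE_JARS = (
    "/usr/hdp/current/hadoop-client/hadoop-azure.jar",
    "/usr/hdp/current/hadoop-client/lib/azure-storage-2.2.0.jar",
)


def lookup(config, dotted_key):
    """Fetch a setting stored under a literal dotted key or as nested objects."""
    if dotted_key in config:
        return config[dotted_key]
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_azure_jars(classpath):
    """Return the required jars absent from a ':'-separated classpath."""
    entries = set(classpath.split(":"))
    return [jar for jar in REQUIRED_AZURE_JARS if jar not in entries]


def check_config(config):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    for service in ("batch-job-runner", "diagnostic-server", "spark-job-service"):
        classpath = lookup(config, service + ".classpath")
        if classpath is None:
            problems.append(f"{service}: classpath is not set")
            continue
        for jar in missing_azure_jars(classpath):
            problems.append(f"{service}: classpath lacks {jar}")
    if lookup(config, "aws.s3.enabled") is not False:
        problems.append("aws.s3.enabled should be false")
    return problems


# Deliberately incomplete fragment, used only to show the kind of findings reported.
sample = json.loads("""
{
  "batch-job-runner": {
    "classpath": "/usr/hdp/current/hadoop-client/hadoop-azure.jar"
  },
  "aws.s3.enabled": true
}
""")
for problem in check_config(sample):
    print(problem)
```

The check accepts both the nested form (`"batch-job-runner": { "classpath": ... }`) and the flat dotted form (`"spark-job-service.classpath"`), since both appear in the steps above.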
...
Hive integration requires additional configuration.
Info |
---|
NOTE: Natively, HDI supports high availability for Hive via a Zookeeper quorum. |
For more information, see Configure for Hive.
Configure for Spark Profiling
...