...
- Hadoop cluster: The Hadoop cluster should already be installed and operational. As part of the install preparation, you should have prepared the Hadoop cluster for integration with the platform. See Prepare Hadoop for Integration with the Platform.
- For more information on the components supported in your Hadoop distribution, see Install Reference.
- Storage: on-premises, cloud, or hybrid.
- The platform can interact with storage that is in the local environment, in the cloud, or in some combination. How your storage is deployed affects your configuration scenarios. See Storage Deployment Options.
- Base storage layer: You must configure one storage platform to be the base storage layer. Details are described later.
Info NOTE: Some deployments require that you select a specific base storage layer.
Warning After you have defined the base storage layer, it cannot be changed. Please review your Storage Deployment Options carefully. The required configuration is described later.
Hadoop versions
The platform supports specific versions of Hadoop distributions.
Info NOTE: The versions of your Hadoop software and the libraries in use by the platform must match.
For more information, see System Requirements.
Platform configuration
After the platform has been installed, you can apply the following changes through platform configuration.
...
For smaller datasets, the platform recommends using the Photon running environment.
For larger datasets, or if the size information is unavailable, the platform recommends by default that you run the job on the Hadoop cluster. For these jobs, the default publishing action is set to run on the Hadoop cluster, generating output in the format defined by this parameter. Publishing actions, including output format, can always be changed as part of the job specification.
...
Acquire Hadoop cluster configuration files
...
Info NOTE: If these configuration files change in the Hadoop cluster, the versions installed on the platform node must be updated accordingly.
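As an illustrative sketch only (the hostname, destination path, and file list below are assumptions, and the exact files required depend on your distribution and the steps above), copying client configuration files from a cluster node might look like the following:
Code Block
# Hypothetical example: copy Hadoop client configuration files from a cluster
# edge node to the platform node. /etc/hadoop/conf is the conventional client
# configuration directory; adjust hostnames, paths, and the file list as needed.
scp hadoop-edge-node:/etc/hadoop/conf/*-site.xml /path/on/platform/node/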
...
If you are using Hortonworks, you must complete the following modification to the site configuration file that is hosted on the platform node.
Info NOTE: Before you begin, you must acquire the full version and build number of your Hortonworks distribution. On any of the Hadoop nodes, navigate to ...
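As a sketch of one way to obtain this value (assuming the hdp-select utility is present on the cluster nodes, which is typical for Hortonworks installations), you can query it from the command line; on Hortonworks nodes the reported Hadoop version string also carries the build suffix:
Code Block
# List the installed Hortonworks stack versions (a value of the form A.B.C.D-XXXX);
# assumes the hdp-select utility is available on the node.
hdp-select versions

# On Hortonworks nodes, the Hadoop version output also embeds the distribution build suffix.
hadoop version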
...
Restart services. See Start and Stop the Platform.
Configure Snappy publication
If you are publishing using Snappy compression, you may need to perform the following additional configuration.
Steps:
- Verify that the snappy and snappy-devel packages have been installed on the platform node. For more information, see https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/NativeLibraries.html.
- From the platform node, execute the following command:
  Code Block
  hadoop checknative
- The above command identifies where the native libraries are located on the platform node.
- Cloudera:
  - On the cluster, locate the libsnappy.so file. Verify that this file has been installed on all nodes of the cluster, including the platform node. Retain the path to the file on the platform node. A command-line sketch for locating the file appears after these steps.
  - In platform configuration, locate the spark.props configuration block. Insert the following properties and values inside the block:
    Code Block
    "spark.driver.extraLibraryPath": "/path/to/file",
    "spark.executor.extraLibraryPath": "/path/to/file",
- Hortonworks:
  - Verify on the platform node that the following locations are available:
    Info NOTE: The asterisk below is a wildcard. Please collect the entire path of both values.
    Code Block
    /hadoop-client/lib/snappy*.jar
    /hadoop-client/lib/native/
  - In platform configuration, locate the spark.props configuration block. Insert the following properties and values inside the block:
    Code Block
    "spark.driver.extraLibraryPath": "/hadoop-client/lib/snappy*.jar;/hadoop-client/lib/native/",
    "spark.executor.extraLibraryPath": "/hadoop-client/lib/snappy*.jar;/hadoop-client/lib/native/",
- Save your changes and restart the platform.
- Verify that the /tmp directory has the proper permissions for publication. For more information, see Supported File Formats.
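The following is a minimal command-line sketch for the verification steps above. It assumes shell access to the relevant nodes; whether the /tmp check applies to the local filesystem or to HDFS depends on your publication targets (see Supported File Formats):
Code Block
# Locate the Snappy native library on a node (the install path varies by distribution).
find / -name "libsnappy.so*" 2>/dev/null

# Check permissions on the local /tmp directory...
ls -ld /tmp

# ...and, if publishing to HDFS, on the HDFS /tmp directory.
hdfs dfs -ls -d /tmp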
Debugging
You can review system services and download log files through the platform's web application.
...