Configure ResourceManager settings
Configure the following:
NOTE: Do not modify the other host/port settings unless you have specific information requiring the modifications.
For more information, see System Ports in the Planning Guide.
Specify distribution client bundle
The Designer Cloud powered by Trifacta platform ships with client bundles supporting a number of major Hadoop distributions. You must configure the JAR file for the distribution that you use. These bundles are stored in the following directory:
Configure the bundle distribution property (hadoopBundleJar) in platform configuration. Examples:
|Cloudera Data Platform|
x.y is the major-minor build number (for example, 5.4).
NOTE: The path must be specified relative to the install directory.
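The relative-path requirement in the note above can be checked before a value is saved to platform configuration. The following Python sketch is illustrative only and is not part of the platform: the function name resolve_bundle_jar, the install directory /opt/example, and the jar path bundles/example-bundle.jar are hypothetical placeholders, not actual product paths.

```python
from pathlib import PurePosixPath


def resolve_bundle_jar(install_dir: str, bundle_jar: str) -> str:
    """Join a relative hadoopBundleJar value onto the install directory.

    Hypothetical helper: rejects absolute paths, because the property
    must be specified relative to the install directory.
    """
    jar = PurePosixPath(bundle_jar)
    if jar.is_absolute():
        raise ValueError(
            "hadoopBundleJar must be relative to the install directory: "
            + bundle_jar
        )
    # Resolve the relative value against the install directory.
    return str(PurePosixPath(install_dir) / jar)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(resolve_bundle_jar("/opt/example", "bundles/example-bundle.jar"))
```

A check like this catches the common mistake of pasting an absolute path into the property. It does not confirm that the jar file exists on disk.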
Tip: If there is no bundle for the distribution you need, try the one that most closely matches its Apache Hadoop baseline. For example, CDH5 is based on Apache Hadoop 2.3.0, so that client bundle will probably run against a vanilla Apache Hadoop 2.3.0 installation. For more information, see Alteryx Support.
For Cloudera distributions, some additional configuration is required. See Configure for Cloudera in the Configuration Guide.
Default Hadoop job results format
For smaller datasets, the platform recommends using the Trifacta Photon running environment.
For larger datasets, or when size information is unavailable, the platform recommends by default that you run the job on the Hadoop cluster. For these jobs, the default publishing action runs on the Hadoop cluster and generates output in the format defined by this parameter. You can always change publishing actions, including the output format, as part of the job specification.
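The recommendation described above can be read as a simple rule: small datasets go to Trifacta Photon, and large datasets or datasets of unknown size go to the Hadoop cluster. The Python sketch below illustrates that reading only. The platform's actual logic and size threshold are not documented here, so the function name, the return labels, and the SMALL_DATASET_LIMIT_BYTES value are all assumptions.

```python
from typing import Optional

# Hypothetical cutoff; the documentation does not state the real threshold.
SMALL_DATASET_LIMIT_BYTES = 1_000_000_000


def recommend_environment(
    size_bytes: Optional[int],
    limit: int = SMALL_DATASET_LIMIT_BYTES,
) -> str:
    """Return the running environment suggested for a dataset.

    size_bytes is None when size information is unavailable.
    """
    if size_bytes is None:
        # Unknown size: default to the Hadoop cluster.
        return "hadoop"
    # Known size: small datasets run on Photon, the rest on Hadoop.
    return "photon" if size_bytes < limit else "hadoop"


if __name__ == "__main__":
    print(recommend_environment(50_000))
    print(recommend_environment(None))
```

This is only a recommendation, so any result from a rule like this remains a default that the user can override in the job specification.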
For more information, see Run Job Page in the User Guide.