...
This section contains hardware and software requirements for successful installation of the product.
Platform Node Requirements
Node Installation Requirements
...
NOTE: If the product is installed in a Hadoop environment, the software must be installed on an edge node of the cluster.
- If it is integrated with a Cloudera cluster, it must be installed on a gateway node that is managed by Cloudera Manager.
- If it is integrated with a Hortonworks cluster, it must be installed on an Ambari/Hadoop client that is managed by Hortonworks Ambari.
- If it is integrated with an HDI cluster, it must be installed on an edge node.
- Customers who originally installed an earlier version on a non-edge node will still be supported. If the software is not installed on an edge node, you may be required to copy over files from the cluster and to synchronize these files after upgrades, and the cluster upgrade process is more complicated.
- This requirement does not apply to the following cluster integrations:
  - AWS EMR
  - Azure Databricks
...
Hardware Requirements
Minimum hardware:
Item | Required |
---|---|
Number of cores | 8 cores |
RAM | 64 GB |
Disk space to install software | 4 GB |
Total free disk space | 16 GB |

Space requirements by volume:
...
Recommended hardware:
Item | Recommended |
---|---|
Number of cores | 16 cores |
RAM | 128 GB |
Disk space to install software | 16 GB |
Total free disk space | 100 GB |

Space requirements by volume:
...
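You can verify these values on the target node before installation. The following is a minimal sketch in Python, assuming a Linux host; the thresholds mirror the minimum requirements table above, and the disk check looks only at the root volume.

```python
# Minimal pre-install hardware check (sketch). Assumes a Linux host with Python 3.
# Thresholds mirror the minimum requirements table above.
import os
import shutil

MIN_CORES = 8
MIN_RAM_GB = 64
MIN_FREE_DISK_GB = 16

cores = os.cpu_count() or 0
ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / (1024 ** 3)
free_disk_gb = shutil.disk_usage("/").free / (1024 ** 3)

print(f"cores={cores}  ram={ram_gb:.1f} GB  free disk on /={free_disk_gb:.1f} GB")
if cores < MIN_CORES:
    print(f"WARNING: fewer than {MIN_CORES} cores")
if ram_gb < MIN_RAM_GB:
    print(f"WARNING: less than {MIN_RAM_GB} GB of RAM")
if free_disk_gb < MIN_FREE_DISK_GB:
    print(f"WARNING: less than {MIN_FREE_DISK_GB} GB of free disk space")
```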
Operating System Requirements
The following operating systems are supported for the node.
NOTE: The product requires 64-bit versions of any supported operating system.
CentOS/RHEL versions:
- CentOS 6.4 - 6.x, 7.1, 7.2, 7.4 - 7.6
- RHEL 6.4 - 6.x, 7.1, 7.2, 7.4 - 7.6
...
NOTE: If you are installing on CentOS/RHEL 7.1, you must be connected to an online repository for some critical updates. Offline installation is not supported for these operating system distributions.
NOTE: For security reasons, RHEL 7.3 is not supported for installation of Release 5.0 or later of the product.
NOTE: Installation on CentOS/RHEL versions 7.4 or earlier requires an upgrade of the RPM software on the node.
Tip: Disabling SELinux on the node is recommended. However, if security policies require it, you may need to apply some changes to the environment.
Ubuntu
...
- Ubuntu 14.04 (codename Trusty) and 16.04 (codename Xenial)
Notes on Ubuntu installation:
NOTE: For Ubuntu installations, some packages must be manually installed. Instructions are provided later in the process.
For more information on RPM dependencies, see System Dependencies.
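As a quick pre-installation check, you can confirm the distribution, version, and architecture of the node. The following is a minimal sketch, assuming a distribution that provides /etc/os-release (CentOS/RHEL 7.x and Ubuntu); on CentOS/RHEL 6.x, check /etc/redhat-release instead.

```python
# Report the OS distribution, version, and architecture of the node (sketch).
# Compare the reported values against the supported versions listed above.
import platform

info = {}
with open("/etc/os-release") as f:
    for line in f:
        key, _, value = line.strip().partition("=")
        if key:
            info[key] = value.strip('"')

print("Distribution:", info.get("NAME", "unknown"), info.get("VERSION_ID", ""))
print("Architecture:", platform.machine())
if platform.machine() != "x86_64":
    print("WARNING: a 64-bit operating system is required")
```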
Database Requirements
The following database versions are supported by the platform.
NOTE: One of these supported versions must be installed on the node.
Supported database versions:
- PostgreSQL 9.6
- MySQL 5.7 Community
Notes on database versions:
- MySQL 5.7 is not supported for installation in Amazon RDS.
NOTE: If you are installing the databases into MySQL, you must download and install the MySQL Java driver onto the node. For more information, see the Databases Guide.
NOTE: The H2 database type is used for internal testing. It is not a supported database.
For more information on installing and configuring the database, see Install Databases in the Databases Guide.
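One way to confirm that the installed database meets the version requirement is to query it directly. The following is a minimal sketch for PostgreSQL, assuming the psycopg2 driver is available; the connection parameters are placeholders for your environment.

```python
# Query a PostgreSQL instance for its version and compare against the supported release (sketch).
# Connection parameters are placeholders; adjust for your environment.
import psycopg2

conn = psycopg2.connect(host="localhost", port=5432, user="postgres",
                        password="postgres", dbname="postgres")
with conn, conn.cursor() as cur:
    cur.execute("SHOW server_version;")
    version = cur.fetchone()[0]
    print("PostgreSQL server version:", version)
    if not version.startswith("9.6"):
        print("WARNING: PostgreSQL 9.6 is the supported version")
conn.close()
```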
...
The following software components must be present.
Java
Tip: Where possible, you should install the same version of Java on the node and on the cluster with which you are integrating.
Java 1.8
Notes on Java versions:
NOTE: There are additional requirements related to the Java JDK listed in the Hadoop Components section below.
NOTE: OpenJDK 1.8 is officially supported. It is installed on the node during the installation process.
NOTE: If you are integrating your instance of the platform with S3, you must install the Oracle JRE 1.8 onto the node. No other version of Java is supported for S3 integration. For more information, see Enable S3 Access in the Configuration Guide.
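You can confirm the active Java version on the node by parsing the output of `java -version`. The following is a minimal sketch; note that the JVM writes its version banner to stderr.

```python
# Verify that Java 1.8 is the active JVM on the node (sketch).
import subprocess

result = subprocess.run(["java", "-version"], capture_output=True, text=True)
banner = result.stderr or result.stdout   # the version banner is written to stderr
print(banner.strip())
if '"1.8' not in banner:
    print("WARNING: Java 1.8 is the supported version for this release")
```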
Other Software
NOTE: For Ubuntu installations, the following packages must be manually installed using Ubuntu-specific versions. Instructions and version numbers are provided later in the process.
- NginX 1.12.2
- NodeJS 10.13.0
Instructions and version numbers are provided later in the process.
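After the packages are installed, their versions can be confirmed from the command line. The following is a minimal sketch; note that `nginx -v` writes its version to stderr, while `node --version` writes to stdout.

```python
# Confirm the installed NginX and NodeJS versions (sketch).
import subprocess

nginx = subprocess.run(["nginx", "-v"], capture_output=True, text=True)
print((nginx.stderr or nginx.stdout).strip())   # e.g. nginx version: nginx/1.12.2

node = subprocess.run(["node", "--version"], capture_output=True, text=True)
print(node.stdout.strip())                      # e.g. v10.13.0
```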
Root User Access
Installation must be executed as the root user on the node.
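A minimal pre-flight check that the installer is being run with root privileges:

```python
# Confirm that the current process is running as the root user (sketch).
import os

if os.geteuid() != 0:
    raise SystemExit("Installation must be executed as the root user.")
print("Running as root.")
```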
SSL Access
(Optional) If users are connecting to the platform ...
...
(Optional) Internet access is not required for installation or operation of the platform. However, if the server does not have Internet access, you must acquire additional software as part of the disconnected install. For more information, see Install Dependencies without Internet Access in the Install Guide.
...
The following requirements apply if you are integrating the platform with a Hadoop cluster.
NOTE: For general guidelines on sizing the cluster, see Sizing Guidelines.
NOTE: If you have upgrades to the Hadoop cluster planned for the next year, you should review those plans with Support prior to installation. For more information, please contact Support.
Supported Hadoop Distributions
The platform supports the following Hadoop distributions.
NOTE: The platform only supports the latest major release and its minor releases of each distribution.
NOTE: The platform only supports the versions of any required components included in a supported distribution. Even if they are upgraded components, use of non-default versions of required components is not supported.
...
Cloudera supported distributions
- CDH 6.3.2 Recommended
- CDH 6.2.1
- CDH 6.1.0
NOTE: CDH 6.x requires that you use the native Spark libraries provided by the cluster. Additional configuration is required. For more information, see Configure for Spark in the Configuration Guide.
- CDH 5.16 Recommended
See Supported Deployment Scenarios for Cloudera in the Install Guide.
Hortonworks supported distributions
- HDP 3.1 Recommended
- HDP 3.0
NOTE: HDP 3.x requires that you use the native Spark libraries provided by the cluster. Additional configuration is required. For more information, see Configure for Spark in the Configuration Guide.
- HDP 2.6
See Supported Deployment Scenarios for Hortonworks in the Install Guide.
EMR supported distributions
See Configure for EMR in the Configuration Guide.
HDInsight supported distributions
See Configure for HDInsight in the Configuration Guide.
Azure Databricks supported distributions
See Configure for Azure Databricks in the Configuration Guide.
Node Requirements
Each cluster node must have the following software:
- Java JDK 1.8 (some exceptions may be listed below)
Hadoop Component Access
...
The following matrix identifies the supported versions of Java and Spark on the Hadoop cluster.
Tip: Where possible, you should install the same version of Java on the node and on the cluster with which you are integrating.
Notes:
- Java must be installed on each node of the cluster. For more information, see https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_jdk_installation.html.
- The versions of Java on the node and on the Hadoop cluster do not have to match.
...
 | Spark 2.2 | Spark 2.3 | Spark 2.4 |
---|---|---|---|
Java 1.8 | Required. | Required. | Required. |
- If you are integrating with an EMR cluster, there are specific version requirements for EMR. See Configure for Spark in the Configuration Guide.
Other components
- HDFS Namenode
- WebHDFS
NOTE: In HDFS, Append Mode must be enabled.
NOTE: If you are enabling high availability failover, you must use HttpFS instead of WebHDFS. For more information, see the Configuration Guide. A connectivity check sketch is provided after this list.
For YARN:
- ResourceManager is running.
- The ApplicationMaster's range of ephemeral ports is open to the node.
For HiveServer2:
- HiveServer2 is supported for metadata publishing.
- WebHCat is not supported.
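To confirm that the node can reach WebHDFS (or HttpFS) on the cluster, you can issue a simple REST call, as referenced in the WebHDFS note above. The following is a minimal sketch, assuming the `requests` package is available; the host, port, and user are placeholders (WebHDFS commonly listens on port 50070 or 9870, HttpFS on 14000).

```python
# Check that WebHDFS or HttpFS is reachable from the node (sketch).
# Host, port, and user below are placeholders for your environment.
import requests

NAMENODE_HOST = "namenode.example.com"
WEBHDFS_PORT = 50070          # 9870 on Hadoop 3.x; 14000 for HttpFS
HDFS_USER = "hdfs"

url = (f"http://{NAMENODE_HOST}:{WEBHDFS_PORT}/webhdfs/v1/"
       f"?op=LISTSTATUS&user.name={HDFS_USER}")
response = requests.get(url, timeout=10)
response.raise_for_status()
entries = response.json()["FileStatuses"]["FileStatus"]
print(f"WebHDFS reachable; root listing returned {len(entries)} entries")
```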
Hadoop System Ports
For more information, see System Ports.
Site Configuration Files
Hadoop cluster configuration files must be copied into the platform deployment.
...
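As part of preparing the node, the relevant cluster configuration files can be copied over. The following is a minimal sketch; the file names, source directory, and destination directory are placeholders that depend on your distribution and installation path.

```python
# Copy Hadoop cluster configuration files onto the node (sketch).
# Paths and file names below are placeholders for your environment.
import shutil
from pathlib import Path

CLUSTER_CONF_DIR = Path("/etc/hadoop/conf")                  # typical client-config location
PLATFORM_CONF_DIR = Path("/opt/platform/conf/hadoop-site")   # placeholder install path

PLATFORM_CONF_DIR.mkdir(parents=True, exist_ok=True)
for name in ("core-site.xml", "hdfs-site.xml", "yarn-site.xml", "hive-site.xml"):
    src = CLUSTER_CONF_DIR / name
    if src.exists():
        shutil.copy2(src, PLATFORM_CONF_DIR / name)
        print(f"copied {src}")
    else:
        print(f"skipped {name} (not found in {CLUSTER_CONF_DIR})")
```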
- Kerberos supported:
  - If Kerberos is enabled, a keytab file must be accessible to the platform.
  - See Configure for Kerberos Integration in the Configuration Guide.
- If Kerberos and secure impersonation are not enabled:
  - The default Hadoop user for the platform must be created on each node of the Hadoop cluster.
  - The default Hadoop directory must be created on the cluster.
  - The default Hadoop user must have full access to the directory, which enables storage of the transformation recipe back into HDFS.
  - See Configure for Hadoop in the Configuration Guide. A setup sketch for these steps follows this list.
Cluster Configuration
For more information on integration with Hadoop, see Prepare Hadoop for Integration with the Platform.
User Requirements
Users must access the platform ...
I/O Requirements
See Supported File Formats in the User Guide.