This section lists the hardware and software requirements for a successful installation of the platform.
NOTE: If the |
NOTE: If you are installing the |
Minimum hardware:
Item | Required |
---|---|
Number of cores | 8 cores |
RAM | 64 GB |
Disk space to install software | 4 GB |
Total free disk space | 16 GB. Space requirements by volume: |
Recommended hardware:
Item | Recommended |
---|---|
Number of cores | 16 cores |
RAM | 128 GB |
Disk space to install software | 16 GB |
Total free disk space | 100 GB. Space requirements by volume: |
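The thresholds in the two hardware tables can be checked before installation with a short script. The following is a minimal sketch, not part of the installer: the function names and the `path` argument are hypothetical, the RAM measurement assumes a Linux host, and the script treats the documented GB figures as GiB.

```python
import os
import shutil

# Thresholds copied from the Minimum and Recommended hardware tables.
MINIMUM = {"cores": 8, "ram_gb": 64, "free_disk_gb": 16}
RECOMMENDED = {"cores": 16, "ram_gb": 128, "free_disk_gb": 100}


def shortfalls(measured, required):
    """Return {item: (measured, required)} for every item below its threshold."""
    return {
        item: (measured[item], needed)
        for item, needed in required.items()
        if measured[item] < needed
    }


def measure_host(path="/"):
    """Measure the local host. `path` is an assumed install volume."""
    gib = 1024 ** 3
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return {
        "cores": os.cpu_count() or 0,
        "ram_gb": ram_bytes / gib,
        "free_disk_gb": shutil.disk_usage(path).free / gib,
    }


if __name__ == "__main__":
    host = measure_host()
    for label, required in (("minimum", MINIMUM), ("recommended", RECOMMENDED)):
        missing = shortfalls(host, required)
        print(label, "OK" if not missing else f"not met: {missing}")
```

A host sold with 64 GB of RAM usually reports slightly less usable memory than that, so a small tolerance on the RAM comparison may be appropriate in practice.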
The following operating systems are supported:
NOTE: The |
RHEL 6.4 - 6.x, 7.1, 7.2, 7.4 - 7.6
NOTE: If you are installing on CentOS/RHEL 7.1, you must be connected to an online repository for some critical updates. Offline installation is not supported for these operating system distributions.
NOTE: For security reasons, RHEL 7.3 is not supported for installation of Release 5.0 or later of the platform.
NOTE: Installation on CentOS/RHEL versions 7.4 or earlier requires an upgrade of the RPM software on the |
Tip: Disabling SELinux on the |
Ubuntu 14.04 (codename Trusty) and 16.04 (codename Xenial)
NOTE: For Ubuntu installations, some packages must be manually installed. Instructions are provided later in the process.
For more information on RPM dependencies, see System Dependencies.
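The supported operating system versions listed above can be expressed as a small lookup. This is a sketch under stated assumptions: the function names are hypothetical, and CentOS is assumed to follow the same version list as RHEL because the notes above refer to "CentOS/RHEL" together.

```python
def parse_version(text):
    """Turn a version string such as '7.4' or '16.04.6' into (major, minor)."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def is_supported_os(distro, version):
    """Check a distribution and version against the list in this section."""
    major, minor = parse_version(version)
    distro = distro.lower()
    if distro in ("rhel", "centos"):  # CentOS handling is an assumption
        if major == 6:
            return minor >= 4                   # 6.4 - 6.x
        if major == 7:
            return minor in (1, 2, 4, 5, 6)     # 7.1, 7.2, 7.4 - 7.6 (7.3 excluded)
        return False
    if distro == "ubuntu":
        return (major, minor) in ((14, 4), (16, 4))  # 14.04 and 16.04
    return False
```

On most current Linux distributions, the distribution name and version can be read from `/etc/os-release` and passed to `is_supported_os`.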
The following database versions are supported by the platform for storing metadata and the user's recipes.
NOTE: One of these supported versions must be installed on the |
Supported versions:
MySQL 5.7 Community
NOTE: If you are installing the databases into MySQL, you must download and install the MySQL Java driver onto the |
NOTE: MySQL 5.7 is not supported for installation in Amazon RDS.
NOTE: The H2 database type is used for internal testing. It is not a supported database.
For more information on installing and configuring the database, see Install Databases in the Databases Guide.
The following software components must be present.
Tip: Where possible, you should install the same version of Java on the |
Java 1.8
NOTE: Additional requirements for the Java JDK are listed in the Hadoop Components section below.
NOTE: OpenJDK 1.8 is officially supported. It is installed on the |
NOTE: If you are integrating your |
NOTE: For Ubuntu installations, the following packages must be manually installed using Ubuntu-specific versions. Instructions and version numbers are provided later in the process.
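To confirm the Java 1.8 requirement above on a given host, the output of `java -version` can be parsed. The following is a minimal sketch: the function names are hypothetical, and it relies on the conventional `version "..."` string that Oracle and OpenJDK builds print.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the Java major version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Releases before Java 9 report themselves as 1.x (for example, 1.8.0_252).
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def is_java_8(version_output):
    """Return True when the reported version is Java 1.8."""
    return java_major_version(version_output) == 8


def installed_java_output():
    """Run `java -version`; the JVM writes this report to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr
```

For example, `is_java_8(installed_java_output())` reports whether the Java on the current `PATH` satisfies the requirement.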
Installation must be executed as the root user.
(Optional) If users are connecting to the platform, an SSL certificate must be created and deployed. See Install SSL Certificate in the Install Guide.
(Optional) Internet access is not required for installation or operation of the platform. However, if the server does not have Internet access, you must acquire additional software as part of the disconnected install. For more information, see Install Dependencies without Internet Access in the Install Guide.
The following requirements apply if you are integrating the platform with an enterprise Hadoop cluster.
NOTE: For general guidelines on sizing the cluster, see Sizing Guidelines.
NOTE: If you have upgrades to the Hadoop cluster planned for the next year, you should review those plans with Support prior to installation. For more information, please contact |
NOTE: The The |
The platform supports the following minimum Hadoop distributions.
CDH 6.2 (Recommended)
CDH 6.1
CDH 6.0
NOTE: CDH 6.x requires that you use the native Spark libraries provided by the cluster. Additional configuration is required. For more information, see Configure for Spark in the Configuration Guide.
See Supported Deployment Scenarios for Cloudera in the Install Guide.
HDP 3.1 (Recommended)
HDP 3.0
NOTE: HDP 3.x requires that you use the native Spark libraries provided by the cluster. Additional configuration is required. For more information, see Configure for Spark in the Configuration Guide.
See Supported Deployment Scenarios for Hortonworks in the Install Guide.
See Configure for EMR in the Configuration Guide.
See Configure for HDInsight in the Configuration Guide.
See Configure for Azure Databricks in the Configuration Guide.
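The CDH and HDP versions listed above can be captured in a lookup table for use in a pre-installation check. This sketch is illustrative only: the names are hypothetical, it treats only the explicitly listed versions as supported, and it does not cover EMR, HDInsight, or Azure Databricks, for which this section lists no versions.

```python
# Versions copied from the distribution list in this section.
SUPPORTED_DISTRIBUTIONS = {
    "CDH": {"versions": ("6.0", "6.1", "6.2"), "recommended": "6.2"},
    "HDP": {"versions": ("3.0", "3.1"), "recommended": "3.1"},
}


def check_distribution(name, version):
    """Report whether a distribution version is listed, and what it implies."""
    entry = SUPPORTED_DISTRIBUTIONS.get(name.upper())
    major_minor = ".".join(version.split(".")[:2])
    supported = entry is not None and major_minor in entry["versions"]
    return {
        "supported": supported,
        "recommended": supported and major_minor == entry["recommended"],
        # CDH 6.x and HDP 3.x both require the cluster's native Spark libraries.
        "native_spark_required": supported,
    }
```

A result with `native_spark_required` set to `True` indicates that the additional Spark configuration described in the notes above applies.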
Each cluster node must have the following software:
Java JDK 1.8 (some exceptions may be listed below)
The platform must have access to the following.
The following matrix identifies the supported versions of Java and Spark on the Hadoop cluster.
Tip: Where possible, you should install the same version of Java on the |
Notes:
 | Spark 2.2 | Spark 2.3 | Spark 2.4 |
---|---|---|---|
Java 1.8 | Required. | Required. | Required. |
WebHDFS
NOTE: In HDFS, Append Mode must be enabled. See Prepare Hadoop for Integration with the Platform.
NOTE: If you are enabling high availability failover, you must use HttpFS instead of WebHDFS. See Enable Integration with Cluster High Availability in the Configuration Guide.
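Access to WebHDFS or HttpFS can be verified with a simple REST request, since HttpFS serves the same `/webhdfs/v1` REST path as WebHDFS. The sketch below only builds the request URL. The function name, the example host names, and the user name are hypothetical, and the ports shown (9870 for the Hadoop 3 NameNode, 14000 for HttpFS) are common defaults that your cluster may override.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path="/", op="GETFILESTATUS", user=None, https=False):
    """Build a WebHDFS or HttpFS REST URL for the given HDFS path."""
    scheme = "https" if https else "http"
    params = {"op": op}
    if user:
        # The user.name parameter applies to clusters using simple authentication.
        params["user.name"] = user
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(params)}"


# Hypothetical hosts; replace with the values for your cluster.
direct = webhdfs_url("namenode.example.com", 9870)
via_httpfs = webhdfs_url("httpfs.example.com", 14000, path="/tmp", user="hdfs")
```

Passing either URL to `urllib.request.urlopen` from the node where the software is installed should return a JSON `FileStatus` document if the service is reachable and the user is permitted to read the path.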
For YARN:
For more information, see System Ports.
Hadoop cluster configuration files must be copied into the platform. See Configure for Hadoop in the Configuration Guide.
For more information on integration with Hadoop, see Prepare Hadoop for Integration with the Platform.
Users must access the platform through one of the supported browser versions. For more information on user system requirements, see Desktop Requirements.
See Supported File Formats in the User Guide.