This section describes the hardware and software requirements for a successful installation of the platform.
If the platform is installed in a Hadoop environment, the software must be installed on an edge node of the cluster.
Minimum hardware:

| Item | Required |
|---|---|
| Number of cores | 8 cores, x86_64 |
| RAM | 64 GB. The platform requires 12 GB of dedicated RAM to start and perform basic operations. |
| Disk space to install software | 4 GB |
| Total free disk space | 16 GB. Space requirements by volume: |
Recommended hardware:

| Item | Recommended |
|---|---|
| Number of cores | 16 cores, x86_64 |
| RAM | 128 GB. The platform requires 12 GB of dedicated RAM to start and perform basic operations. |
| Disk space to install software | 16 GB |
| Total free disk space | 100 GB. Space requirements by volume: |
The following operating systems are supported for the platform. The platform requires 64-bit versions of any supported operating system.
CentOS/RHEL versions:
CentOS 7.4 - 7.7, 8.1
NOTE: MySQL 5.7 Community is not supported on CentOS/RHEL 8.1.
RHEL 7.4 - 7.7, 8.1
Notes on CentOS/RHEL installation:
Ubuntu versions:
Ubuntu 18.04 (codename Bionic Beaver)
Ubuntu 16.04 (codename Xenial Xerus)
Notes on Ubuntu installation:
For more information on RPM dependencies, see System Dependencies.
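The OS and architecture requirements above can be verified from `/etc/os-release`. This is a sketch, assuming a host that provides that file (standard on CentOS/RHEL and Ubuntu); the version pattern mirrors the supported lists above.

```shell
# Sketch: confirm the host reports a supported 64-bit OS release.
. /etc/os-release            # sets ID (e.g. "ubuntu") and VERSION_ID
arch=$(uname -m)
echo "os=$ID version=$VERSION_ID arch=$arch"

case "$ID-$VERSION_ID" in
  centos-7.[4-7]|centos-8.1|rhel-7.[4-7]|rhel-8.1|ubuntu-16.04|ubuntu-18.04)
    echo "release is in the supported list" ;;
  *)
    echo "WARN: $ID $VERSION_ID is not in the supported list" ;;
esac
[ "$arch" = "x86_64" ] || echo "WARN: x86_64 (64-bit) is required"
```

Note that some CentOS builds report only a major version (e.g. `7`) in `VERSION_ID`, in which case the exact minor release should be confirmed from `/etc/centos-release` instead.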
The following database versions are supported by the platform for storing metadata and the user's recipes.
Supported database versions:
PostgreSQL 12.3
MySQL 5.7 Community
NOTE: MySQL 5.7 Community is not supported on CentOS/RHEL 8.1.
Notes on database versions:
MySQL 5.7 is not supported for installation in Amazon RDS.
For more information on installing and configuring the database, see Install Databases in the Databases Guide.
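Before configuring the database connection, it can be useful to confirm which client versions are present on the node. A minimal sketch, assuming the `psql`/`mysql` client binaries are on the PATH (adjust for your site):

```shell
# Sketch: report installed database client versions, if present.
# Expected per the supported list above: PostgreSQL 12.3 or MySQL 5.7 Community.
if command -v psql >/dev/null 2>&1; then
  psql --version
else
  echo "psql not found on PATH"
fi
if command -v mysql >/dev/null 2>&1; then
  mysql --version
else
  echo "mysql not found on PATH"
fi
db_check_done=1
```

The client version is only a hint; verify the server version itself (e.g. `SELECT version();`) against the supported list before installing.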
The following software components must be present.
Where possible, you should install the same version of Java on the platform and on the cluster with which you are integrating.
Java 1.8
Notes on Java versions:
OpenJDK 1.8 is officially supported. It is installed on the platform during the installation process.
NOTE: If you are using Azure Databricks as a datasource, please verify that OpenJDK 1.8.0_242 or earlier is installed on the platform.
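The Java requirement above can be checked from the shell. A sketch, assuming `java` is on the PATH of the installing user:

```shell
# Sketch: verify that the active JVM is Java 1.8 (OpenJDK).
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
  if java -version 2>&1 | grep -q '"1\.8'; then
    echo "Java 1.8 detected"
  else
    echo "WARN: active JVM does not report version 1.8"
  fi
else
  echo "WARN: java not found on PATH"
fi
java_check_done=1
```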
For Ubuntu installations, the following packages must be manually installed using Ubuntu-specific versions:
Instructions and version numbers are provided later in the process.
Installation must be executed as the root user on the platform node.
(Optional) If users are connecting to the platform, an SSL certificate must be created and deployed. See Install SSL Certificate in the Install Guide.
(Optional) Internet access is not required for installation or operation of the platform. However, if the server does not have Internet access, you must acquire additional software as part of the disconnected install. For more information, see Install Dependencies without Internet Access in the Install Guide.
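For the optional SSL step above, a throwaway self-signed certificate can be useful for pre-production testing. This is a sketch only; the key path, certificate path, and CN below are placeholders, and production deployments should use a CA-signed certificate as described in Install SSL Certificate.

```shell
# Sketch: generate a self-signed certificate for testing purposes only.
# Paths and CN are placeholders; do not use self-signed certs in production.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout /tmp/platform.key -out /tmp/platform.crt \
  -subj "/CN=platform.example.com"
openssl x509 -in /tmp/platform.crt -noout -subject
```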
The following requirements apply if you are integrating the platform with an enterprise Hadoop cluster.
The platform supports the following minimum Hadoop distributions.
CDH 6.3 Recommended
CDH 6.2
CDH 6.1
NOTE: CDH 6.x requires that you use the native Spark libraries provided by the cluster. Additional configuration is required. For more information, see Configure for Spark in the Configuration Guide.
See Supported Deployment Scenarios for Cloudera in the Install Guide.
HDP 3.1 Recommended
HDP 3.0
NOTE: HDP 3.x requires that you use the native Spark libraries provided by the cluster. Additional configuration is required. For more information, see Configure for Spark in the Configuration Guide.
See Supported Deployment Scenarios for Hortonworks in the Install Guide.
See Configure for EMR in the Configuration Guide.
See Configure for HDInsight in the Configuration Guide.
See Configure for Azure Databricks in the Configuration Guide.
Each cluster node must have the following software:
Java JDK 1.8 (some exceptions may be listed below)
The platform must have access to the following.
The following matrix identifies the supported versions of Java and Spark on the Hadoop cluster. Where possible, you should install the same version of Java on the platform and on the cluster with which you are integrating.
Notes:
| | Spark 2.3 | Spark 2.4 |
|---|---|---|
| Java 1.8 | Required. | Required. |
WebHDFS
In HDFS, Append Mode must be enabled. See Prepare Hadoop for Integration with the Platform.
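Append Mode is commonly exposed through the `dfs.support.append` property in `hdfs-site.xml`. A minimal sketch of the entry (the property name is assumed from stock Hadoop; verify against your distribution and Prepare Hadoop for Integration with the Platform):

```xml
<!-- Sketch: enable HDFS Append Mode in hdfs-site.xml. -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```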
For YARN:
For more information, see System Ports.
Hadoop cluster configuration files must be copied into the platform. See Configure for Hadoop in the Configuration Guide.
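The copy step above can be sketched as follows. Both the source (`HADOOP_CONF_DIR`) and the destination directory here are assumptions for illustration; use the locations given in Configure for Hadoop for your deployment.

```shell
# Sketch: stage Hadoop cluster client configs for the platform.
# Source and destination paths are placeholders.
src="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
dst="/tmp/platform-hadoop-conf"   # stand-in for the platform config directory
mkdir -p "$dst"
for f in core-site.xml hdfs-site.xml yarn-site.xml; do
  if [ -f "$src/$f" ]; then
    cp "$src/$f" "$dst/" && echo "copied $f"
  else
    echo "skipped $f (not found in $src)"
  fi
done
```

Re-run the copy whenever the cluster configuration changes, so the staged files do not drift out of date.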
For more information on integration with Hadoop, see Prepare Hadoop for Integration with the Platform.
Users must access the platform through one of the supported browser versions. For more information on user system requirements, see Desktop Requirements.
See Supported File Formats in the User Guide.