This section contains hardware and software requirements for successful installation of the product.

Platform Node Requirements

Node Installation Requirements

...

NOTE: If the platform is installed in a Hadoop environment, the software must be installed on an edge node of the cluster.


  • If it is integrated with a Cloudera cluster, it must be installed on a gateway node that is managed by Cloudera Manager.
  • If it is integrated with a Hortonworks cluster, it must be installed on an Ambari/Hadoop client that is managed by Hortonworks Ambari.
  • If it is integrated with an HDI cluster, it must be installed on an edge node.
  • Customers who originally installed an earlier version on a non-edge node will still be supported. If the software is not installed on an edge node, you may be required to copy over files from the cluster and to synchronize these files after upgrades, which makes the cluster upgrade process more complicated.


  • This requirement does not apply to the following cluster integrations:
    • AWS EMR
    • Azure Databricks

...

Hardware Requirements

Minimum hardware:

Item | Required
Number of cores | 8 cores, x86_64
RAM | 64 GB
Disk space to install software | 4 GB
Total free disk space | 16 GB

NOTE: The platform requires 12 GB of dedicated RAM to start and perform basic operations.

Space requirements by volume:

  • /opt - 10 GB
  • /var - Remainder

Recommended hardware:

Item | Recommended
Number of cores | 16 cores, x86_64
RAM | 128 GB
Disk space to install software | 16 GB
Total free disk space | 100 GB

NOTE: The platform requires 12 GB of dedicated RAM to start and perform basic operations.

Space requirements by volume:

  • /opt - 10 GB
  • /var - Remainder
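
As an illustration only, the hardware figures above can be checked before installation with a short script. The following Python sketch is not part of the product. The names `HardwareSpec`, `shortfalls`, and `detect` are hypothetical, and `detect` assumes a Linux host because it relies on Linux `sysconf` names.

```python
import os
import platform
import shutil
from dataclasses import dataclass
from typing import List

GB = 1024 ** 3


@dataclass
class HardwareSpec:
    """Hypothetical container for the figures listed in the tables above."""
    cores: int
    ram_gb: float
    free_disk_gb: float
    arch: str = "x86_64"


# Values copied from the minimum and recommended hardware tables.
MINIMUM = HardwareSpec(cores=8, ram_gb=64, free_disk_gb=16)
RECOMMENDED = HardwareSpec(cores=16, ram_gb=128, free_disk_gb=100)


def shortfalls(actual: HardwareSpec, required: HardwareSpec) -> List[str]:
    """Return one message per requirement that `actual` does not meet."""
    problems = []
    if actual.arch != required.arch:
        problems.append(f"architecture is {actual.arch}, expected {required.arch}")
    if actual.cores < required.cores:
        problems.append(f"{actual.cores} cores, need {required.cores}")
    if actual.ram_gb < required.ram_gb:
        problems.append(f"{actual.ram_gb:.1f} GB RAM, need {required.ram_gb}")
    if actual.free_disk_gb < required.free_disk_gb:
        problems.append(
            f"{actual.free_disk_gb:.1f} GB free disk, need {required.free_disk_gb}"
        )
    return problems


def detect(path: str = "/") -> HardwareSpec:
    """Read the local host's figures (Linux only).

    The kernel reserves some memory, so a nominal 64 GB host usually
    reports slightly less; treat the RAM comparison as approximate.
    """
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return HardwareSpec(
        cores=os.cpu_count() or 0,
        ram_gb=ram_bytes / GB,
        free_disk_gb=shutil.disk_usage(path).free / GB,
        arch=platform.machine(),
    )
```

Calling `shortfalls(detect("/opt"), MINIMUM)` on the intended node returns an empty list when every minimum is met. Free space is measured on a single path here, so the per-volume split between /opt and /var would need one call per volume.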

Operating System Requirements

The following operating systems are supported for the platform node.

NOTE: The platform requires 64-bit versions of any supported operating system.

CentOS/RHEL versions:


  • CentOS 6.4 - 6.x, 7.1, 7.2, 7.4 - 7.6
  • RHEL 6.4 - 6.x, 7.1, 7.2, 7.4 - 7.6

...



NOTE: If you are installing on CentOS/RHEL 7.1, you must be connected to an online repository for some critical updates. Offline installation is not supported for these operating system distributions.

NOTE: For security reasons, RHEL 7.3 is not supported for installation of Release 5.0 or later of the platform. Please upgrade to RHEL 7.4 or a later supported release.


NOTE: Installation on CentOS/RHEL versions 7.4 or earlier requires an upgrade of the RPM software on the platform node. Details are provided during the installation process.


Tip: Disabling SELinux on the platform node is recommended. However, if your security policies require SELinux to remain enabled, you may need to apply some changes to the environment.

Ubuntu versions:

  • Ubuntu 14.04 (codename Trusty) and 16.04 (codename Xenial)

Notes on Ubuntu installation:

  • NOTE: For Ubuntu installations, some packages must be manually installed. Instructions are provided later in the process.
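
The supported-version lists above can be encoded as a small lookup. This is a minimal Python sketch, assuming the distribution name and version string have already been read from the host; the function name `is_supported_os` is hypothetical and not part of the product.

```python
def is_supported_os(distro: str, version: str) -> bool:
    """Check a distribution/version pair against the lists in this section.

    CentOS/RHEL: 6.4 - 6.x, 7.1, 7.2, 7.4 - 7.6 (7.3 is excluded).
    Ubuntu: 14.04 and 16.04.
    """
    name = distro.strip().lower()
    parts = version.strip().split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        # No usable major.minor pair, so the version cannot be matched.
        return False

    if name in ("centos", "rhel"):
        if major == 6:
            return minor >= 4
        if major == 7:
            return minor in (1, 2, 4, 5, 6)
        return False
    if name == "ubuntu":
        return (major, minor) in ((14, 4), (16, 4))
    return False
```

Only the first two version components are compared, so a point release such as `7.6.1810` is treated as 7.6.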


For more information on RPM dependencies, see System Dependencies.

Database Requirements

The following database versions are supported by the platform for storing metadata and the user's recipes.

NOTE: One of these supported versions must be installed on the platform node.

Supported database versions:

  • PostgreSQL 9.6
  • MySQL 5.7 Community

Notes on database versions:

  • NOTE: MySQL 5.7 is not supported for installation in Amazon RDS.
  • NOTE: If you are installing the databases into MySQL, you must download and install the MySQL Java driver onto the platform node. For more information, see Install Databases for MySQL in the Databases Guide.


NOTE: The H2 database type is used for internal testing. It is not a supported database.


For more information on installing and configuring the database, see Install Databases in the Databases Guide.

...

The following software components must be present.

Java

Tip: Where possible, you should install the same version of Java on the platform node and on the cluster with which you are integrating.


  • Java 1.8

Notes on Java versions:

  • NOTE: There are additional requirements related to Java JDK listed in the Hadoop Components section below.
  • NOTE: OpenJDK 1.8 is officially supported. It is installed on the platform node during the installation process.
  • NOTE: If you are integrating your platform instance with S3, you must install the Oracle JRE 1.8 onto the platform node. No other version of Java is supported for S3 integration. For more information, see Enable S3 Access in the Configuration Guide.
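
To confirm that a node or cluster host is running Java 1.8, the banner printed by `java -version` can be parsed. The sketch below is illustrative only and assumes the usual banner form, such as `openjdk version "1.8.0_212"`. The helper names are hypothetical.

```python
import re
import subprocess
from typing import Optional, Tuple

# Matches the quoted version in the first line of the banner.
_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?')


def parse_java_version(banner: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from a `java -version` banner, or None."""
    match = _VERSION_RE.search(banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def is_java_8(banner: str) -> bool:
    """Java 8 reports itself as version 1.8."""
    return parse_java_version(banner) == (1, 8)


def read_java_banner(java_cmd: str = "java") -> str:
    """Run `java -version`; the JVM writes the banner to stderr."""
    try:
        result = subprocess.run(
            [java_cmd, "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        # No Java executable on the PATH.
        return ""
    return result.stderr or result.stdout
```

On a host with Java installed, `is_java_8(read_java_banner())` reports whether the default `java` on the PATH is a 1.8 release. It does not distinguish OpenJDK from the Oracle JRE, which matters for the S3 note above.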


Other Software

NOTE: For Ubuntu installations, the following packages must be manually installed using Ubuntu-specific versions. Instructions and version numbers are provided later in the process.

  • NginX 1.12.2
  • NodeJS 10.13.0

Root User Access

Installation must be executed as the root user on the platform node.

SSL Access

(Optional) If users are connecting to the platform, an SSL certificate must be created and deployed. See Install SSL Certificate in the Install Guide.

...

(Optional) Internet access is not required for installation or operation of the platform. However, if the server does not have Internet access, you must acquire additional software as part of the disconnected install. For more information, see Install Dependencies without Internet Access in the Install Guide.

...

The following requirements apply if you are integrating the platform with an enterprise Hadoop cluster.

NOTE: For general guidelines on sizing the cluster, see Sizing Guidelines.


NOTE: If you have upgrades to the Hadoop cluster planned for the next year, you should review those plans with Support prior to installation. For more information, please contact Support.

Supported Hadoop Distributions

The platform supports the following minimum Hadoop distributions.

NOTE: The platform only supports the latest major release and its minor releases of each distribution.

The platform only supports the versions of any required components included in a supported distribution. Use of non-default versions of required components is not supported, even if those components have been upgraded.

Cloudera supported distributions

  • CDH 6.Recommended

  • CDH 6.

  • CDH 6.10

    NOTE: CDH 6.x requires that you use the native Spark libraries provided by the cluster. Additional configuration is required. For more information, see Configure for Spark in the Configuration Guide.

  • CDH 5.16 Recommended

See Supported Deployment Scenarios for Cloudera in the Install Guide.

...

  • HDP 3.1 Recommended
  • HDP 3.0

    NOTE: HDP 3.x requires that you use the native Spark libraries provided by the cluster. Additional configuration is required. For more information, see Configure for Spark in the Configuration Guide.

  • HDP 2.6

See Supported Deployment Scenarios for Hortonworks in the Install Guide.
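
The notes above state that CDH 6.x and HDP 3.x require the cluster's native Spark libraries. As a rough sketch, that rule can be captured in a helper used by a pre-installation checklist script. The function name `requires_native_spark` is hypothetical. It encodes only the rule stated in the notes and does not validate whether a given minor version is supported.

```python
def requires_native_spark(distribution: str, version: str) -> bool:
    """Return True when the notes above call for cluster-native Spark.

    Per this section: CDH 6.x and HDP 3.x require the native Spark
    libraries provided by the cluster; earlier major versions do not
    carry that note.
    """
    name = distribution.strip().upper()
    major = version.strip().split(".")[0]
    return (name, major) in {("CDH", "6"), ("HDP", "3")}
```

When the function returns True, the additional steps in Configure for Spark apply.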

EMR supported distributions

See Configure for EMR in the Configuration Guide.

HDInsight supported distributions

See Configure for HDInsight in the Configuration Guide.

Azure Databricks supported distributions

See Configure for Azure Databricks in the Configuration Guide.


Node Requirements

Each cluster node must have the following software:

  • Java JDK 1.8 (some exceptions may be listed below)

Hadoop Component Access

The platform deployment must have access to the following.

...

The following matrix identifies the supported versions of Java and Spark on the Hadoop cluster.

Tip: Where possible, you should install the same version of Java on the platform node and on the cluster with which you are integrating.

Notes:

...


Java version | Spark 2.2 | Spark 2.3 | Spark 2.4
Java 1.8 | Required. | Required. | Required.

  • If you are integrating with an EMR cluster, there are specific version requirements for EMR. See Configure for Spark in the Configuration Guide.


Other components

  • HDFS Namenode
  • For YARN:
    • ResourceManager is running.
    • ApplicationMaster's range of ephemeral ports is open to the platform node.
  • HiveServer2:
    • HiveServer2 is supported for metadata publishing.
    • WebHCat is not supported.

Hadoop System Ports

For more information, see System Ports.

Site Configuration Files

Hadoop cluster configuration files must be copied into the platform deployment. See Configure for Hadoop in the Configuration Guide.

...

  • Kerberos supported:
  • If Kerberos and secure impersonation are not enabled:
    • The platform's default Hadoop user must be created on each node of the Hadoop cluster.
    • The default Hadoop directory for that user must be created on the cluster.
    • The default Hadoop user must have full access to the directory, which enables storage of the transformation recipe back into HDFS.
    • See Configure for Hadoop in the Configuration Guide.

Cluster Configuration

For more information on integration with Hadoop, see Prepare Hadoop for Integration with the Platform.

User Requirements

Users must access the platform through one of the supported browser versions. For more information on user system requirements, see Desktop Requirements.

I/O Requirements

See Supported File Formats in the User Guide.