
AWS Deployment Scenarios

The following are the basic AWS deployment scenarios.

D s platform deployed through AWS Marketplace:

| Deployment Scenario | D s node installation | Base Storage Layer | Storage - S3 | Storage - Redshift | Cluster | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| D s product productedge AWS install through AWS Marketplace CloudFormation template | EC2 | S3 | read/write | read/write | None | D s product productedge does not support integration with any running environment clusters. All job execution occurs on the D s node in the D s photon running environment. This scenario is suitable for smaller user groups and data volumes. |
| D s product AWS install through AWS Marketplace CloudFormation template - with integration to EMR cluster | EC2 | S3 | read/write | read/write | EMR | This deployment scenario integrates by default with an EMR cluster, which is created as part of the process. It does not support integration with a Hadoop cluster. |


D s platform installed on AWS:

| Deployment Scenario | D s node installation | Base Storage Layer | Storage - S3 | Storage - Redshift | Cluster | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| D s product AWS install with S3 read access | EC2 | HDFS | read only | Not supported | EMR | When HDFS is the base storage layer, the only accessible AWS resource is read-only access to S3. |
| D s product AWS install with S3 read/write access | EC2 | S3 | read/write | read/write | EMR | |
| D s product AWS install with S3 read/write access | EC2 | S3 | read/write | read/write | AWS Databricks | |

D s platform installed on-premises and integrated with AWS resources:

| Deployment Scenario | D s node installation | Base Storage Layer | Storage - S3 | Storage - Redshift | Cluster | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| D s product on-premises install with S3 read access | On-premises | HDFS | read only | Not supported | Hadoop | When HDFS is the base storage layer, the only accessible AWS resource is read-only access to S3. |
| Microsoft Azure | | | | | | Integration with AWS-based resources is not supported. |

 

Legend and Notes:

| Column | Notes |
| --- | --- |
| Deployment Scenario | Description of the AWS-connected deployment |
| D s node installation | Location where the D s node is installed in this scenario. All AWS installations are installed on EC2 instances. |
| Base Storage Layer | When the D s platform is first installed, the base storage layer must be set. NOTE: After you have begun using the product, you cannot change the base storage layer. NOTE: Read/write access to AWS-based resources requires that S3 be set as the base storage layer. |
| Storage - S3 | D s product supports read access to S3 when the base storage layer is set to HDFS. For read/write access to S3, the base storage layer must be set to S3. NOTE: Users can enable access to other S3 buckets by creating new connections through the D s webapp. For more information, see Connection Types. |
| Storage - Redshift | For access to Redshift, the base storage layer must be set to S3. |
| Cluster | List of cluster types that are supported for integration and job execution at scale. The D s platform can integrate with at most one cluster. It cannot integrate with two different clusters at the same time. Access to an EMR cluster requires S3 to be the base storage layer. Smaller jobs can be executed on the D s photon running environment, which is hosted on the D s node itself. For more information, see Running Environment Options in the Configuration Guide. |
| Notes | Any additional notes |
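
The rules in the legend can be collected into a small pre-install check. This is a minimal sketch, not part of the product: the function name `check_plan` and its parameters are hypothetical, and it encodes only the rules stated in the legend above.

```python
from typing import List, Optional


def check_plan(
    base_storage: str,
    s3_access: str = "none",
    use_redshift: bool = False,
    clusters: Optional[List[str]] = None,
) -> List[str]:
    """Return rule violations for a planned deployment; empty means consistent.

    All names here are illustrative assumptions, not product settings.
    """
    clusters = clusters or []
    problems: List[str] = []

    # The base storage layer is either S3 or HDFS.
    if base_storage not in ("S3", "HDFS"):
        problems.append("Base storage layer must be S3 or HDFS.")

    # Read/write access to S3 requires S3 as the base storage layer.
    if s3_access == "read/write" and base_storage != "S3":
        problems.append("S3 read/write access requires S3 as the base storage layer.")

    # Redshift access requires S3 as the base storage layer.
    if use_redshift and base_storage != "S3":
        problems.append("Redshift access requires S3 as the base storage layer.")

    # The platform integrates with at most one cluster.
    if len(clusters) > 1:
        problems.append("Only one cluster can be integrated at a time.")

    # EMR access requires S3 as the base storage layer.
    if "EMR" in clusters and base_storage != "S3":
        problems.append("EMR access requires S3 as the base storage layer.")

    return problems


if __name__ == "__main__":
    # An HDFS-based plan that also asks for Redshift is flagged.
    for issue in check_plan("HDFS", "read only", use_redshift=True):
        print(issue)
```

Because the base storage layer cannot be changed after the product is in use, a check of this kind is most useful before installation.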

AWS Installations

D s product productedge on AWS Marketplace (AMI)

Through the AWS Marketplace, you can license and deploy an AMI of D s product productedge, which does not require integration with a clustered running environment. All job execution happens within the AMI on the EC2 instance that you deploy. For more information, see the D s product productedge listing on AWS Marketplace.

D s product on AWS Marketplace with EMR

You can deploy an AMI of the D s platform onto an EC2 instance.

You can deploy it in either of the following ways:

  1. Auto-create a 3-node EMR cluster.
  2. Integrate it later with your pre-existing EMR cluster.

For more information, see the D s product productee listing on AWS Marketplace.

D s product on EC2 Instance

When the D s platform is installed on AWS, it is deployed on an EC2 instance. In the EC2 console, you must specify a few key parameters.

NOTE: After you have created the instance, retain the instanceId from the console. This value must be applied to the configuration in the D s platform.
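
Before the retained instanceId is applied to the platform configuration, its shape can be sanity-checked. The following is a minimal sketch, assuming the standard EC2 identifier form (`i-` followed by 8 or 17 lowercase hexadecimal characters); the function name is hypothetical and is not part of the product.

```python
import re

# EC2 instance IDs: "i-" plus 8 (older) or 17 (current) lowercase hex characters.
_INSTANCE_ID = re.compile(r"i-(?:[0-9a-f]{8}|[0-9a-f]{17})")


def is_ec2_instance_id(value: str) -> bool:
    """Return True if `value` looks like an EC2 instance ID.

    This checks the format only; it does not confirm that the instance exists.
    """
    return _INSTANCE_ID.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    print(is_ec2_instance_id("i-0123456789abcdef0"))
    print(is_ec2_instance_id("ami-12345678"))
```

A format check catches common copy-and-paste mistakes, such as pasting an AMI ID in place of the instance ID.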

AWS Integrations

The following table describes the different AWS components that can host or integrate with the D s platform. Combinations of one or more of these items constitute the deployment scenarios listed in the AWS Deployment Scenarios section.

| AWS Service | Description | Base Storage Layer | Other Required AWS Services |
| --- | --- | --- | --- |
| EC2 | Amazon Elastic Compute Cloud (EC2) can be used to host the D s node in a scalable cloud-based environment. The following deployments are supported: D s product productee with or without access to an EMR cluster; D s product productedge on an AMI. | Base storage layer can be S3 or HDFS. If set to HDFS, only read access to S3 is permitted. | |
| S3 | Amazon Simple Storage Service (S3) can be used for reading data sources, writing job results, and hosting the D s item itemdatabases. Access to specific S3 buckets can be enabled through new connections. | Base storage layer can be S3 or HDFS. If set to HDFS, only read access to S3 is permitted. | |
| Redshift | Amazon Redshift provides a scalable data warehouse platform, designed for big data analytics applications. The D s platform can be configured to read and write from Amazon Redshift database tables. | Base Storage Layer = S3 | S3 |
| EMR | For more information on supported versions of EMR, see Configure for EMR in the Configuration Guide. | Base Storage Layer = S3 | EC2 instance |
| Amazon RDS | Optionally, the D s item itemdatabases can be installed on Amazon RDS. For more information, see Install Databases on Amazon RDS in the Databases Guide. | Base Storage Layer = S3 | |

AWS Marketplace integrations:

| AWS Service | Description | Base Storage Layer | Other Required AWS Services |
| --- | --- | --- | --- |
| AMI | Through the AWS Marketplace, you can license and install an Amazon Machine Image (AMI) instance of D s product productedge. This product is intended for smaller user groups that do not need large-scale processing of Hadoop-based clusters. | Base Storage Layer = S3. NOTE: HDFS is not supported. | EC2 instance |
| EMR | Through the AWS Marketplace, you can license and install an AMI specifically configured to work with Amazon Elastic MapReduce (EMR), a Hadoop-based data processing platform. | Base Storage Layer = S3 | AMI |
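
The "Other Required AWS Services" column in the integration tables implies a short dependency chain: for example, the Marketplace EMR offering needs the AMI, which in turn needs an EC2 instance. The following is a minimal sketch of expanding that chain. The labels `Marketplace AMI` and `Marketplace EMR` are hypothetical names used here only to distinguish the Marketplace rows from the general ones, and the dictionary is transcribed from the tables rather than taken from any product API.

```python
from typing import Dict, List

# Direct requirements transcribed from the "Other Required AWS Services" column.
# Keys prefixed with "Marketplace" are illustrative labels for the Marketplace table.
REQUIRES: Dict[str, List[str]] = {
    "EC2": [],
    "S3": [],
    "Redshift": ["S3"],
    "EMR": ["EC2 instance"],
    "Amazon RDS": [],
    "Marketplace AMI": ["EC2 instance"],
    "Marketplace EMR": ["Marketplace AMI"],
    "EC2 instance": [],
}


def all_required(service: str) -> List[str]:
    """Return every service that `service` depends on, directly or indirectly."""
    found: List[str] = []
    queue = list(REQUIRES.get(service, []))
    while queue:
        current = queue.pop(0)
        if current in found:
            continue  # already recorded; avoids repeats and loops
        found.append(current)
        queue.extend(REQUIRES.get(current, []))
    return found


if __name__ == "__main__":
    print(all_required("Marketplace EMR"))
```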