Supported Deployment Scenarios for AWS

This section describes supported deployment scenarios for Designer Cloud Powered by Trifacta Enterprise Edition on customer-managed AWS infrastructure.

AWS Deployment Scenarios

The following are the basic AWS deployment scenarios.

Designer Cloud Powered by Trifacta platform deployed through AWS Marketplace:

| Deployment Scenario | Trifacta node installation | Base Storage Layer | Storage - S3 | Storage - Redshift | Cluster | Notes |
|---|---|---|---|---|---|---|
| Data Preparation for Amazon Redshift and S3 AWS install through AWS Marketplace CloudFormation template | EC2 | S3 | read/write | read/write | None | Data Preparation for Amazon Redshift and S3 does not support integration with any running environment clusters. All job execution occurs on the Trifacta node in the Trifacta Photon running environment. This scenario is suitable for smaller user groups and data volumes. |
| Designer Cloud Powered by Trifacta Enterprise Edition AWS install through AWS Marketplace CloudFormation template - with integration to EMR cluster | EC2 | S3 | read/write | read/write | EMR | This deployment scenario integrates by default with an EMR cluster, which is created as part of the process. It does not support integration with a Hadoop cluster. |

Designer Cloud Powered by Trifacta platform installed on AWS:

| Deployment Scenario | Trifacta node installation | Base Storage Layer | Storage - S3 | Storage - Redshift | Cluster | Notes |
|---|---|---|---|---|---|---|
| Designer Cloud Powered by Trifacta Enterprise Edition AWS install with S3 read access | EC2 | HDFS | read only | Not supported | EMR | When HDFS is the base storage layer, the only accessible AWS resource is S3, with read-only access. |
| Designer Cloud Powered by Trifacta Enterprise Edition AWS install with S3 read/write access | EC2 | S3 | read/write | read/write | EMR | |
| Designer Cloud Powered by Trifacta Enterprise Edition AWS install with S3 read/write access | EC2 | S3 | read/write | read/write | AWS Databricks | |

Designer Cloud Powered by Trifacta platform installed on-premises and integrated with AWS resources:

| Deployment Scenario | Trifacta node installation | Base Storage Layer | Storage - S3 | Storage - Redshift | Cluster | Notes |
|---|---|---|---|---|---|---|
| Designer Cloud Powered by Trifacta Enterprise Edition on-premises install with S3 read access | On-premises | HDFS | read only | Not supported | Hadoop | When HDFS is the base storage layer, the only accessible AWS resource is S3, with read-only access. |
| Microsoft Azure | | | | | | Integration with AWS-based resources is not supported. |

Legend and Notes:

Column

Notes

Deployment Scenario

Description of the AWS-connected deployment

Trifacta node installation

Location where the Trifacta node is installed in this scenario.

All AWS installations are installed on EC2 instances.

Base Storage Layer

When the Designer Cloud Powered by Trifacta platform is first installed, the base storage layer must be set.

Note

After you have begun using the product, you cannot change the base storage layer.

Note

Read/write access to AWS-based resources requires that S3 be set as the base storage layer.

Storage - S3

Designer Cloud Powered by Trifacta Enterprise Edition supports read access to S3 when the base storage layer is set to HDFS.

For read/write access to S3, the base storage layer must be set to S3.

Note

Users can enable access to other S3 buckets by creating new connections through the Trifacta Application. For more information, see Connection Types.

Storage - Redshift

For access to Redshift, the base storage layer must be set to S3.

Cluster

List of cluster types that are supported for integration and job execution at scale.

  • The Designer Cloud Powered by Trifacta platform can integrate with at most one cluster. It cannot integrate with two different clusters at the same time.

  • Access to an EMR cluster requires S3 to be the base storage layer.

  • Smaller jobs can be executed on the Trifacta Photon running environment, which is hosted on the Trifacta node itself.

  • For more information, see Running Environment Options in the Configuration Guide.

Notes

Any additional notes
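
The storage rules in the legend above can be summarized in a few lines of code. The following is a minimal Python sketch, not product code: the `AwsAccess` class and the `access_for` function are hypothetical names introduced for illustration, and the mapping encodes only the rules stated in this legend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AwsAccess:
    """AWS access that the legend associates with a base storage layer."""
    s3: str          # "read/write" or "read only"
    redshift: bool   # True if Redshift access is available
    emr: bool        # True if EMR cluster integration is available


def access_for(base_storage_layer: str) -> AwsAccess:
    """Map a base storage layer ("S3" or "HDFS") to the access described above."""
    layer = base_storage_layer.strip().upper()
    if layer == "S3":
        # S3 as base storage layer: read/write to S3, Redshift, and EMR.
        return AwsAccess(s3="read/write", redshift=True, emr=True)
    if layer == "HDFS":
        # HDFS as base storage layer: read-only access to S3 and nothing else.
        return AwsAccess(s3="read only", redshift=False, emr=False)
    raise ValueError(f"unknown base storage layer: {base_storage_layer!r}")
```

Because the base storage layer cannot be changed after you begin using the product, a check of this kind is most useful before installation, when the required AWS access is still being decided.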

AWS Installations

Data Preparation for Amazon Redshift and S3 on AWS Marketplace (AMI)

Through the AWS Marketplace, you can license and deploy an AMI of Data Preparation for Amazon Redshift and S3, which does not require integration with a clustered running environment. All job execution happens within the AMI on the EC2 instance that you deploy. For more information, see the Data Preparation for Amazon Redshift and S3 listing on AWS Marketplace.

Designer Cloud Enterprise Edition on AWS Marketplace with EMR

You can deploy an AMI of the Designer Cloud Powered by Trifacta platform onto an EC2 instance.

You can deploy it in either of the following ways:

  1. Auto-create a 3-node EMR cluster.

  2. Integrate it later with your pre-existing EMR cluster.

For more information, see the Designer Cloud Powered by Trifacta Enterprise Edition listing on AWS Marketplace.

Designer Cloud Enterprise Edition on EC2 Instance

When the Designer Cloud Powered by Trifacta platform is installed on AWS, it is deployed on an EC2 instance. You must specify a few key parameters through the EC2 console.

Note

After you have created the instance, you should retain the instanceId from the console, which must be applied to the configuration in the Designer Cloud Powered by Trifacta platform.
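
Before applying the retained value to the platform configuration, it may help to confirm that it looks like an EC2 instance ID. The following Python sketch is illustrative only: `is_instance_id` is a hypothetical helper, and it assumes the usual EC2 format of `i-` followed by 8 (older) or 17 (current) lowercase hexadecimal characters. It checks the format only, not whether the instance exists.

```python
import re

# Assumed format: "i-" plus 8 (older) or 17 (current) lowercase hex characters.
_INSTANCE_ID = re.compile(r"i-(?:[0-9a-f]{8}|[0-9a-f]{17})")


def is_instance_id(value: str) -> bool:
    """Return True if value has the shape of an EC2 instance ID."""
    return _INSTANCE_ID.fullmatch(value.strip()) is not None
```

For example, a value copied with an `ami-` prefix by mistake would be rejected, because it is an image ID rather than an instance ID.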

AWS Integrations

The following table describes the different AWS components that can host or integrate with the Designer Cloud Powered by Trifacta platform. Each of the deployment scenarios listed above is a combination of one or more of these items.

AWS Service

Description

Base Storage Layer

Other Required AWS Services

EC2

Amazon Elastic Compute Cloud (EC2) can be used to host the Trifacta node in a scalable cloud-based environment. The following deployments are supported:

  • Designer Cloud Powered by Trifacta Enterprise Edition with or without access to an EMR cluster

  • Data Preparation for Amazon Redshift and S3 on an AMI

Base storage layer can be S3 or HDFS.

If set to HDFS, only read access to S3 is permitted.

S3

Amazon Simple Storage Service (S3) can be used for reading data sources, writing job results, and hosting the Alteryx databases.

Access to specific S3 buckets can be enabled through new connections.

Base storage layer can be S3 or HDFS.

If set to HDFS, only read access to S3 is permitted.

Redshift

Amazon Redshift provides a scalable data warehouse platform, designed for big data analytics applications. The Designer Cloud Powered by Trifacta platform can be configured to read from and write to Amazon Redshift database tables.

Base Storage Layer = S3

S3

EMR

For more information on supported versions of EMR, see Configure for EMR in the Configuration Guide.

Base Storage Layer = S3

EC2 instance

Amazon RDS

Optionally, the Alteryx databases can be installed on Amazon RDS. For more information, see Install Databases on Amazon RDS in the Databases Guide.

Base Storage Layer = S3
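
The requirements in the table above can be combined when planning which services to use together. The following Python sketch is one possible way to do that, under stated assumptions: `REQUIREMENTS` and `plan` are hypothetical names, the entries are transcribed from this table (with "EC2 instance" recorded as `EC2`), and empty cells are treated as "no other required services".

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# service -> (allowed base storage layers, other required AWS services)
REQUIREMENTS: Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]] = {
    "EC2": (frozenset({"S3", "HDFS"}), frozenset()),
    "S3": (frozenset({"S3", "HDFS"}), frozenset()),
    "Redshift": (frozenset({"S3"}), frozenset({"S3"})),
    "EMR": (frozenset({"S3"}), frozenset({"EC2"})),
    "Amazon RDS": (frozenset({"S3"}), frozenset()),
}


def plan(services: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (allowed base storage layers, additional services still needed)."""
    wanted = set(services)
    layers: Set[str] = {"S3", "HDFS"}
    required: Set[str] = set()
    for service in wanted:
        allowed, others = REQUIREMENTS[service]
        layers &= allowed      # every service must accept the chosen layer
        required |= others     # collect the services each one depends on
    # Report only the services that were not already requested.
    return layers, required - wanted
```

For instance, `plan(["EC2", "Redshift"])` narrows the base storage layer to S3 and reports S3 as an additional required service, which matches the Redshift row of the table.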

AWS Marketplace integrations:

AWS Service

Description

Base Storage Layer

Other Required AWS Services

AMI

Through the AWS Marketplace, you can license and install an Amazon Machine Image (AMI) instance of Data Preparation for Amazon Redshift and S3. This product is intended for smaller user groups that do not need the large-scale processing of Hadoop-based clusters.

Base Storage Layer = S3

Note

HDFS is not supported.

EC2 instance

EMR

Through the AWS Marketplace, you can license and install an AMI specifically configured to work with Amazon Elastic MapReduce (EMR), a Hadoop-based data processing platform.

Base Storage Layer = S3

AMI