Scenario Description
NOTE: All hardware in use for supporting the platform is maintained within your enterprise infrastructure on AWS.
- Installation of the platform on an EC2 server in AWS
- Installation of the platform databases on AWS
- Integration with a supported EMR cluster
- Base storage layer and backend datastore of S3
For more information on deployment scenarios, see Supported Deployment Scenarios for AWS.

Limitations
Deployment Limitations
The following limitations apply to installations of the platform on AWS:
- No support for high availability and failover
- Job cancellation is not supported on EMR.
- When publishing single files to S3, you cannot apply an append publishing action.
- The following limitations apply to EMR integration only:
  - No support for Hive integration
  - No support for secure impersonation or Kerberos
Product Limitations
For general limitations of the platform, see Product Limitations in the Planning Guide.

Pre-requisites
Please acquire the following assets:
- Install Package: Acquire the installation package for your operating system.
- License Key: As part of the installation package, you should receive a license key file. See License Key for details.
- For more information, contact
.
- Offline system dependencies: If you are completing the installation without Internet access, you must also acquire the offline versions of the system dependencies. See Install Dependencies without Internet Access.

AWS desktop requirements
- All desktop users must be able to connect to the EC2 instance through the enterprise infrastructure.

AWS pre-requisites
Depending on which of the following AWS components you are deploying, additional pre-requisites and limitations may apply. Please review these sections as well.

Preparation
Before you install the platform on AWS, please verify that you have completed the following:
- Read: Please read this entire document before you create the EMR cluster or install the platform.
- VPC: Enable and deploy a working AWS VPC.
  - In your VPC, you must define a subnet where you plan to deploy the platform.
- S3: Enable and deploy an AWS S3 bucket to use as the base storage layer for the platform. In the bucket, the platform stores metadata in the following location: <S3_bucket_name>/trifacta
  - See https://s3.console.aws.amazon.com/s3/home.
- IAM Policies: Create IAM policies for access to the S3 bucket. Required permissions are the following:
- EC2 instance role: Create an EC2 instance role for your S3 bucket policy. See https://console.aws.amazon.com/iam/home#/roles.
- EC2 instance: Deploy an AWS EC2 instance with SELinux where the platform can be installed. The required set of ports must be enabled for listening. See System Ports in the Planning Guide. This node should be dedicated to the platform.
  NOTE: The EC2 node must meet the system requirements for installing the platform. For more information, see System Requirements in the Planning Guide.
- EMR cluster: An existing EMR cluster is required.
  - Cluster sizing: Before you begin, you should allocate sufficient resources for sizing the cluster. For guidance, please contact your .
  - See Deploy the Cluster below.
- Databases:
  - The platform utilizes a set of databases that must be accessible from the platform node. Databases are installed as part of the workflow described later.
  - If installing databases on Amazon RDS, an admin account to RDS is required. For more information, see Install Databases on Amazon RDS.
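
The metadata location noted in the S3 item above can be sketched in a few lines of Python. This is only an illustration: the helper name `metadata_location` and the example bucket name are hypothetical, and the check covers only the basic S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit), not every rule AWS enforces.

```python
import re

# Basic S3 bucket naming rules only: 3-63 characters, lowercase letters,
# digits, dots, and hyphens, starting and ending with a letter or digit.
# Other AWS rules (for example, no adjacent dots) are NOT checked here.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")

# Taken from the documented location <S3_bucket_name>/trifacta.
METADATA_PREFIX = "trifacta"


def metadata_location(bucket_name: str) -> str:
    """Return '<bucket_name>/trifacta' for a plausibly valid bucket name.

    Hypothetical helper for illustration; it is not part of the platform.
    """
    if not _BUCKET_NAME.fullmatch(bucket_name):
        raise ValueError(f"not a valid S3 bucket name: {bucket_name!r}")
    return f"{bucket_name}/{METADATA_PREFIX}"


if __name__ == "__main__":
    # "my-example-bucket" is a placeholder, not a real bucket.
    print(metadata_location("my-example-bucket"))
```

Knowing this location ahead of time may help when scoping the IAM policies for the bucket, since the platform needs to read and write under that path.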

AWS Information
Before you begin installation, please acquire the following information from AWS:
- EMR:
  - AWS region for the EMR cluster, if it exists
  - ID for the EMR cluster, if it exists
    - If you are creating an EMR cluster as part of this process, please retain the ID.
  - The EMR cluster must allow access from the platform node. This configuration is described later.
- Subnet: Subnet within your virtual private cloud (VPC) where you want to launch the platform node.
  - This subnet should be in the same VPC as the EMR cluster.
  - The subnet can be private or public.
  - If it is private and cannot access the Internet, additional configuration is required. See below.
- S3:
- EC2:
  - Instance type for the platform node
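
One way to keep track of the information listed above is a small worksheet in code. The following is a minimal sketch, not part of the product: the class `AwsInstallInfo`, its field names, and the `review` function are all hypothetical, and the checks simply restate the notes in the list (subnet in the same VPC as the EMR cluster; a private subnet without Internet access needs additional configuration).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AwsInstallInfo:
    """Values to collect before installation (all field names are hypothetical)."""
    emr_region: Optional[str] = None         # AWS region for the EMR cluster
    emr_cluster_id: Optional[str] = None     # retain this ID if you create the cluster
    emr_vpc_id: Optional[str] = None         # VPC that hosts the EMR cluster
    subnet_id: Optional[str] = None          # subnet where the node is launched
    subnet_vpc_id: Optional[str] = None      # VPC that contains that subnet
    subnet_is_private: bool = False
    subnet_has_internet: bool = True
    ec2_instance_type: Optional[str] = None  # instance type for the node


def review(info: AwsInstallInfo) -> List[str]:
    """Return a list of open items; an empty list means nothing was flagged."""
    findings: List[str] = []
    for name in ("emr_region", "emr_cluster_id", "subnet_id", "ec2_instance_type"):
        if not getattr(info, name):
            findings.append(f"still to collect: {name}")
    # The subnet should be in the same VPC as the EMR cluster.
    if info.emr_vpc_id and info.subnet_vpc_id and info.emr_vpc_id != info.subnet_vpc_id:
        findings.append("subnet should be in the same VPC as the EMR cluster")
    # A private subnet without Internet access requires additional configuration.
    if info.subnet_is_private and not info.subnet_has_internet:
        findings.append("private subnet without Internet access: additional configuration is required")
    return findings


if __name__ == "__main__":
    # All values below are placeholders, not recommendations.
    example = AwsInstallInfo(
        emr_region="us-west-2",
        emr_cluster_id="j-EXAMPLE",
        emr_vpc_id="vpc-aaaa",
        subnet_id="subnet-1111",
        subnet_vpc_id="vpc-bbbb",
        subnet_is_private=True,
        subnet_has_internet=False,
        ec2_instance_type="m5.xlarge",
    )
    for item in review(example):
        print(item)
```

If the EMR cluster does not exist yet, the region and cluster ID will naturally show up as open items until the cluster has been created.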

Internet access