NOTE: Content in this section does not apply to deployments from the AWS Marketplace, which provide fewer deployment and configuration options. For more information, see the AWS Marketplace.



Scenario Description

NOTE: All hardware in use for supporting the platform is maintained within the enterprise infrastructure on AWS.

  • Installation of the platform on an EC2 server in AWS
  • Installation of the platform databases on AWS
  • Integration with a supported EMR cluster
  • Base storage layer and backend datastore of S3

NOTE: When the above installation and configuration steps have been completed, the platform is operational. Additional configuration may be required, which is referenced at the end of this section.

For more information on deployment scenarios, see Supported Deployment Scenarios for AWS.

Product Limitations

The following limitations apply to installations of the platform on AWS:

  • No support for high availability and failover
  • Job cancellation is not supported on EMR.
  • When publishing single files to S3, you cannot apply an append publishing action.
  • The following limitations apply to EMR integration only:
    • No support for Hive integration
    • No support for secure impersonation or Kerberos

Pre-requisites

Desktop Requirements

  • All desktop users of the platform should have a supported version of Google Chrome installed on their desktops.
    • For more information, see Desktop Requirements.
    • If a supported browser is not available within your enterprise, desktop users can install the enterprise application as a separate application. For more information, see Install Desktop Application.
  • All desktop users must be able to connect to the EC2 instance through the enterprise infrastructure.

AWS Pre-requisites

Depending on which of the following AWS components you are deploying, additional pre-requisites and limitations may apply. Please review these sections as well.

Prep

Before you begin, please verify that you have completed the following:

 

  1. Review Planning Guide: Please review and verify Install Preparation and sub-topics.
    1. Limitations: For more information on limitations of this scenario, see Product Limitations in the Install Preparation area.
  2. Read: Please read this entire document before you create the EMR cluster or install the platform.

  3. Acquire Assets: Acquire the installation package for your operating system and your license key. For more information, contact support.
    1. If you are completing the installation without Internet access, you must also acquire the offline versions of the system dependencies. See Install Dependencies without Internet Access.
  4. VPC: Enable and deploy a working AWS VPC.
  5. S3: Enable and deploy an AWS S3 bucket to use as the base storage layer for the platform. In the bucket, the platform stores metadata in the following location:

    <S3_bucket_name>/trifacta

    See https://s3.console.aws.amazon.com/s3/home.

  6. IAM Policies: Create IAM policies for access to the S3 bucket. Required permissions are the following: 
    • The system account or individual user accounts must have full permissions for the S3 bucket:

      Delete*, Get*, List*, Put*, Replicate*, Restore*


    • These policies must apply to the bucket and its contents. Example:

      "arn:aws:s3:::my-trifacta-bucket-name"
      "arn:aws:s3:::my-trifacta-bucket-name/*"


    • See https://console.aws.amazon.com/iam/home#/policies
  7. EC2 instance role: Create an EC2 instance role for your S3 bucket policy. See https://console.aws.amazon.com/iam/home#/roles.
  8. EC2 instance: Deploy an AWS EC2 instance with SELinux where the software can be installed.
    1. The required set of ports must be enabled for listening. See System Ports.

    2. This node should be dedicated for platform use.

      NOTE: The EC2 node must meet the system requirements. For more information, see System Requirements.


  9. EMR cluster: An existing EMR cluster is required.
    1. Cluster sizing: Before you begin, you should allocate sufficient resources for sizing the cluster. For guidance, please contact your representative.

    2. See Deploy the Cluster below.
  10. Databases:
    1. The platform utilizes a set of databases that must be accessed from the platform node. Databases are installed as part of the workflow described later.
    2. For more information on the supported databases and versions, see System Requirements.
    3. For more information on database installation requirements, see Install Databases.
    4. If you are installing databases on Amazon RDS, an admin account for RDS is required. For more information, see Install Databases on Amazon RDS.
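
The IAM Policies step above lists the required action patterns and the two resource ARNs (the bucket and its contents). As a minimal sketch of how those pieces fit into a policy document, the following Python builds the JSON for a given bucket name. The function name `build_bucket_policy` and the single-statement layout are assumptions made for illustration; the `s3:` prefix and the `2012-10-17` version string are standard IAM policy syntax. Adjust the result to match your organization's security requirements before using it.

```python
import json

# Action patterns listed in the IAM Policies step. The "s3:" prefix is the
# standard IAM service namespace for S3 actions.
ACTION_PATTERNS = ["Delete*", "Get*", "List*", "Put*", "Replicate*", "Restore*"]


def build_bucket_policy(bucket_name):
    """Return an IAM policy document (as a dict) for the given S3 bucket.

    Hypothetical helper: the policy covers both the bucket itself and
    every object in it, as the step above requires.
    """
    bucket_arn = "arn:aws:s3:::" + bucket_name
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:" + pattern for pattern in ACTION_PATTERNS],
                # First ARN targets the bucket, second targets its contents.
                "Resource": [bucket_arn, bucket_arn + "/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Bucket name taken from the example ARNs above.
    print(json.dumps(build_bucket_policy("my-trifacta-bucket-name"), indent=2))
```

The printed JSON can be pasted into the policy editor in the IAM console when you create the policy.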

AWS Information

Before you begin installation, please acquire the following information from AWS:

  • EMR:
    • AWS region for the EMR cluster, if it exists.
    • ID for EMR cluster, if it exists
      • If you are creating an EMR cluster as part of this process, please retain the ID.
      • The EMR cluster must allow access from the platform node. This configuration is described later.
  • Subnet: Subnet within your virtual private cloud (VPC) where you want to launch the platform.
    • This subnet should be in the same VPC as the EMR cluster.
    • Subnet can be private or public.
    • If it is private and it cannot access the Internet, additional configuration is required. See below.
  • S3:
    • Name of the S3 bucket that the platform can use
    • Path to resources on the S3 bucket

  • EC2: 
    • Instance type for the platform node
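
One way to keep track of the items above is to record them in a small structure and check for gaps before you start. The following Python is only a sketch: the class name, field names, and the helper `missing_items` are hypothetical and are not part of the product. The EMR fields are optional because the cluster may not exist yet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AwsInstallInfo:
    """Hypothetical record of the AWS details gathered before installation."""
    s3_bucket: str                        # name of the S3 bucket the platform can use
    s3_path: str                          # path to resources on the S3 bucket
    subnet_id: str                        # subnet in the same VPC as the EMR cluster
    ec2_instance_type: str                # instance type for the platform node
    emr_region: Optional[str] = None      # only known if the cluster exists
    emr_cluster_id: Optional[str] = None  # only known if the cluster exists


def missing_items(info: AwsInstallInfo) -> List[str]:
    """Return the names of required fields that are still empty."""
    required = ["s3_bucket", "s3_path", "subnet_id", "ec2_instance_type"]
    return [name for name in required if not getattr(info, name)]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    info = AwsInstallInfo(
        s3_bucket="my-trifacta-bucket-name",
        s3_path="",
        subnet_id="subnet-example",
        ec2_instance_type="",
    )
    print("Still needed:", missing_items(info))
```

If you create the EMR cluster as part of this process, fill in `emr_region` and `emr_cluster_id` once the cluster ID is available.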

Internet access

See Configure for AWS.

Deploy the Cluster

In your AWS infrastructure, you must deploy a supported version of EMR across a recommended number of nodes to support the expected data volumes of your jobs.

...

After you have completed the above, please complete the following steps in the order listed:

1 - Install software

Install the platform software on the EC2 node you created. See Install Software.

2 - Install databases

The platform requires several databases for storing metadata.

NOTE: The software assumes that you are installing the databases on a PostgreSQL server on the same node as the software. If you are installing them elsewhere, or are changing database names or ports, additional configuration is required as part of this installation process.

For more information, see Install Databases in the Databases Guide.

3 - Log in to the application

After the software and databases are installed, you can log in to the application to complete configuration. See Login.

As soon as you log in, you should change the password on the admin account. In the left menu bar, select Settings > Admin Settings. Scroll down to Manage Users. For more information, see Change Admin Password.

Tip: At this point, you can access the online documentation through the application. In the left menu bar, select Help menu > Product Docs. All of the following content, plus updates, is available online. See Documentation below.


Configure for EMR

NOTE: If you are creating a new EMR cluster as part of this installation process, please skip this section. That workflow is covered later in the document. For more information, see Configure for EMR.

...