Comment: Published by Scroll Versions from space DEV and version r0762

...

  1. Instance role: Create an IAM role and link it to the EC2 instance where the Trifacta node is hosted.

    1. Include the following IAM policy: 

      Code Block
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "VisualEditor0",
                  "Effect": "Allow",
                  "Action": "sts:AssumeRole",
                  "Resource": "arn:aws:iam::*:role/*"
              }
          ]
      } 
    2. For more information, see https://aws.amazon.com/premiumsupport/knowledge-center/assign-iam-role-ec2-instance/.
  2. User role: Create another IAM role that provides the required access to the S3 buckets. Example:


    Code Block
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "MyBucketAndObjectPermissions",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:PutObject"
                ],
                "Effect": "Allow",
                "Resource": [
                    "arn:aws:s3:::<my_s3_bucket>",
                    "arn:aws:s3:::<my_s3_bucket>/*"
                ]
            },
            {
                "Sid": "TrifactaPublicDatasets",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::trifacta-public-datasets/*",
                    "arn:aws:s3:::trifacta-public-datasets"
                ]
            }
        ]
    }

    where:
    <my_s3_bucket> is the name of your bucket.

  3. Under the user role definition, edit the Trust relationship. Add the instance role to Principal:

    Code Block
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::<awsAccountId>:role/instanceRole"
            ]
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }

    1. For more information, see Insert Trust Relationship in AWS IAM Role.

    2. For more granular control over the Trust relationship, see https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_principal.html.
  4. AWS Glue: If you are integrating with AWS Glue, additional permissions must be set. For more information, see Enable AWS Glue Access.

  5. Log in to the Trifacta platform as an administrator.

  6. Click the link to specify storage settings. Populate the values for:
    1. IAM role
    2. Role ARN
    3. S3 Bucket Name
  7. Save your changes.
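
The policy documents in steps 2 and 3 differ between deployments only in the account ID, role name, and bucket name. As a minimal sketch, not part of the product, the following Python fills those placeholders consistently. The function names `build_trust_policy` and `bucket_resources`, the account ID, and the bucket name are hypothetical.

```python
import json


def build_trust_policy(account_id: str, instance_role: str) -> dict:
    """Return a trust relationship that lets the instance role assume the user role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam::{account_id}:role/{instance_role}"]
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


def bucket_resources(bucket: str) -> list:
    """Return the bucket ARN and the object ARN pattern used in the user role policy."""
    return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]


if __name__ == "__main__":
    # Hypothetical values; substitute your own account ID, role, and bucket.
    print(json.dumps(build_trust_policy("123456789012", "instanceRole"), indent=2))
    print(bucket_resources("my-example-bucket"))
```

Paste the generated JSON into the Trust relationship editor of the user role, and check it against the example in step 3 before saving.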

Enable Attribute-Based Access to S3

When IAM roles are used for per-user authentication, the Trifacta platform can be configured to pass an additional attribute as part of any request for S3 resources through AWS Secure Token Service. This attribute, called a session tag, contains the user identifier, which is the username part of the user's email address. This userId is used as the key within S3 to identify the permissions available to the user on S3. In this manner, you can leverage your existing enterprise S3 permissioning for more precise access, without having to replicate the permissioning in the platform.

Info

NOTE: After enabling the use of session tags, you must spin up a new EMR cluster, which forces EMR to use the newly deployed credential provider JAR file.

For more information on session tags, see https://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_attribute-based-access-control.html.
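
To make the session tag concrete, the sketch below derives a userId from an email address and assembles request parameters shaped like those of the AWS STS AssumeRole operation (`RoleArn`, `RoleSessionName`, `Tags`). It is illustrative only: how the platform builds its own request is not described here, and the function names, role ARN, tag name `userId`, and the use of the userId as session name are assumptions.

```python
def user_id_from_email(email: str) -> str:
    """Return the username part of an email address (the text before the last '@')."""
    local, sep, _domain = email.rpartition("@")
    if not sep or not local:
        raise ValueError(f"not an email address: {email!r}")
    return local


def assume_role_params(role_arn: str, email: str, tag_name: str) -> dict:
    """Build AssumeRole-style parameters that carry the userId as a session tag."""
    user_id = user_id_from_email(email)
    return {
        "RoleArn": role_arn,
        "RoleSessionName": user_id,  # assumption: session named after the user
        "Tags": [{"Key": tag_name, "Value": user_id}],
    }


if __name__ == "__main__":
    # Hypothetical role ARN and tag name.
    params = assume_role_params(
        "arn:aws:iam::123456789012:role/userRole", "jdoe@example.com", "userId"
    )
    print(params)
```

A dictionary of this shape could be passed to an STS client's assume-role call. The role being assumed must allow `sts:TagSession` in its trust policy, as described later in this section.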

Pre-requisites

  • S3 must be set as the base storage layer. For more information, see Set Base Storage Layer.
  • The Trifacta platform must be configured to use IAM roles through the temporary credential provider mechanism for per-user authentication to AWS. See above.
  • A userId must be matched to the identifier that is used within the enterprise infrastructure to define S3 access.
  • If you are running jobs on EMR, only EMR 5.29.0 and later are supported.
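
If you script your pre-flight checks, the EMR requirement above can be tested against a release label. This is a rough sketch that assumes labels of the form `emr-5.29.0` and a plain numeric comparison. The function name is hypothetical.

```python
def emr_release_supported(release_label: str) -> bool:
    """Return True if a label such as 'emr-5.29.0' is version 5.29.0 or later."""
    version = release_label
    if version.startswith("emr-"):
        version = version[len("emr-"):]
    parts = tuple(int(p) for p in version.split("."))
    # Pad short versions such as '5.29' so tuple comparison behaves as expected.
    parts = (parts + (0, 0, 0))[:3]
    return parts >= (5, 29, 0)


if __name__ == "__main__":
    for label in ("emr-5.28.1", "emr-5.29.0", "emr-6.2.0"):
        print(label, emr_release_supported(label))
```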

Specify general Hadoop bundle JAR file

This feature requires that you deploy the generic Hadoop bundle JAR file for use when running Spark jobs. Version-specific bundle JARs, which are used by default, do not have the latest AWS SDK binaries, which are required for this feature. There are no functional issues with using the generic bundle JAR, which includes these binaries.

Please complete the following steps.

Steps:

  1. Open the platform configuration for editing.

  2. Locate the following parameter and set it to the value listed below:

    Code Block
    "hadoopBundleJar": "hadoop-deps/generic-hadoop/build/libs/generic-hadoop-bundle.jar"


  3. Save your changes and restart the platform.
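
If you manage the platform configuration as a JSON document, the change in step 2 can be scripted. The sketch below sets every occurrence of `hadoopBundleJar` in a nested structure. The nesting shown in the sample (`"spark"`) is invented for illustration and may not match your configuration. The helper name `set_key` is hypothetical.

```python
GENERIC_BUNDLE = "hadoop-deps/generic-hadoop/build/libs/generic-hadoop-bundle.jar"


def set_key(node, key: str, value) -> int:
    """Set every occurrence of `key` in nested dicts/lists; return how many were set."""
    changed = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                changed += 1
            else:
                changed += set_key(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            changed += set_key(item, key, value)
    return changed


if __name__ == "__main__":
    # Hypothetical configuration layout, for illustration only.
    config = {"spark": {"hadoopBundleJar": "version-specific-bundle.jar"}}
    count = set_key(config, "hadoopBundleJar", GENERIC_BUNDLE)
    print(count, config)
```

A return value of 0 means the parameter was not found, in which case leave the file untouched and locate the setting manually.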

Modify IAM policy 

The IAM policy used for S3 access must be modified to include the required permissions. When using session tags, every trust policy must grant the sts:TagSession permission. Below, the previous trust policy has been modified to include the required elements:

Info

NOTE: The sts:TagSession permission must be added to all IAM roles that are used to connect to S3 or S3-related resources.


Code Block
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
        "arn:aws:iam::<awsAccountId>:role/instanceRole"
        ]
      },
      "Action": [
        "sts:AssumeRole",
        "sts:TagSession"
      ]
    }
  ]
}
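
Because the permission must be present on every role involved, it can help to check exported trust policies before enabling the feature. The following sketch reports which of the two required actions a trust policy fails to grant. It compares action names literally and does not expand wildcards such as `sts:*`. The function name is hypothetical.

```python
REQUIRED_ACTIONS = {"sts:AssumeRole", "sts:TagSession"}


def missing_trust_actions(policy: dict) -> set:
    """Return the required STS actions not granted by any Allow statement."""
    granted = set()
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear as an object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may appear as a string
            actions = [actions]
        granted.update(actions)
    return REQUIRED_ACTIONS - granted


if __name__ == "__main__":
    old_policy = {"Statement": [{"Effect": "Allow", "Action": "sts:AssumeRole"}]}
    print(missing_trust_actions(old_policy))
```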

Enable

When the above change has been applied, you can enable the feature.

Steps:

  1. Open the workspace settings for editing.


  2. Locate the following setting, and set it to Enabled:

    Code Block
    Session Tags: Enable the use of session tags when assuming an IAM Role
  3. In the following setting, specify the value that the Trifacta application should insert for the tag when requesting AWS resources:

    Code Block
    Session Tags: The name of the session tag that holds the username as its value
  4. A restart is not required.

Info

NOTE: Users must log out and log in again for the permission changes from session tags to take effect.
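
On the AWS side, the tag name configured above can be referenced in a permissions policy through the `aws:PrincipalTag` policy variable. The sketch below generates one possible policy that limits each user to a per-user prefix. The `home/<userId>/` layout, the bucket name, and the tag name `userId` are assumptions for illustration. Your enterprise S3 permissioning may be organized differently.

```python
import json


def per_user_prefix_policy(bucket: str, tag_name: str) -> dict:
    """Build a policy restricting access to home/<session tag value>/ in a bucket."""
    # IAM substitutes this variable with the session tag value at request time.
    principal_tag = "${aws:PrincipalTag/" + tag_name + "}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOwnPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {
                    "StringLike": {"s3:prefix": [f"home/{principal_tag}/*"]}
                },
            },
            {
                "Sid": "ReadWriteOwnPrefix",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/home/{principal_tag}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and tag name.
    print(json.dumps(per_user_prefix_policy("my-example-bucket", "userId"), indent=2))
```

The tag name passed here must match the value entered in the session tag name setting, otherwise the variable resolves to nothing and access is denied.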

User Access

After per-user authentication has been enabled, each user must provide or be provided the credentials and S3 bucket to use. Users can insert a default S3 bucket and credentials to use in their profiles. See Configure Your Access to S3.