...

Pre-requisites

  • On the platform node, you must install the Oracle Java Runtime Environment for Java 1.8. Other versions of the JRE are not supported. For more information on the JRE, see http://www.oracle.com/technetwork/java/javase/downloads/index.html
  • Access to AWS S3 Regional Endpoints through internet protocol is required.

Required AWS Account Permissions

All access to S3 sources occurs through a single AWS account (system mode) or through an individual user's account (user mode). For either mode, the AWS access key and secret combination must provide read and write access to the default bucket associated with the account. 

Info

NOTE: To enable viewing and browsing of all folders within a bucket, the following permissions are required:

  • The system account or individual user accounts must have the ListAllMyBuckets access permission for the bucket.
  • All objects to be browsed within the bucket must have Get access enabled.
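
The two permissions in the note above could be expressed as an IAM policy document along the following lines. This is a minimal sketch, not a complete policy: the bucket name `example-bucket` and the helper `build_browse_policy` are hypothetical, and the read and write access required for the default bucket is not covered here.

```python
import json


def build_browse_policy(bucket: str) -> dict:
    """Build an IAM policy document covering the browse permissions in the note.

    The bucket name is supplied by the caller; nothing here is specific
    to any real account.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the account list the buckets it owns.
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets"],
                "Resource": "*",
            },
            {
                # Lets the account read (Get) objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder name used only for illustration.
    print(json.dumps(build_browse_policy("example-bucket"), indent=2))
```

The printed JSON can be reviewed against your own account policies before anything is attached to a user or role.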

Configuration

Depending on your S3 environment, you can define:

  • S3 as base storage layer
  • S3 access protocol method
  • read access to S3
  • S3 bucket that is the default write destination
  • access to additional S3 buckets

Define base storage layer

The base storage layer is the default platform for storing results. To enable write access to S3, you must define it as the base storage layer for your deployment.

Warning

The base storage layer for your instance is defined during initial installation and cannot be changed afterward.

If S3 is the base storage layer, you must also define the default storage bucket to use during initial installation, which cannot be changed at a later time. See Define default S3 write bucket below.

For more information on the various options for storage, see Storage Deployment Options.

For more information on setting the base storage layer, see Set Base Storage Layer.

S3 access protocol method

The platform requires use of the S3a protocol to connect to S3.

Info

NOTE: Use of s3n is not supported.
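
As an illustration of the protocol requirement above, the following sketch checks whether a path uses the `s3a` scheme before it is used. The function name `is_supported_s3_uri` and the sample URIs are hypothetical and are not part of the platform.

```python
from urllib.parse import urlparse


def is_supported_s3_uri(uri: str) -> bool:
    """Return True only when the URI uses the s3a scheme."""
    return urlparse(uri).scheme.lower() == "s3a"


if __name__ == "__main__":
    # Placeholder URIs for illustration only.
    for candidate in ("s3a://example-bucket/data.csv", "s3n://example-bucket/data.csv"):
        print(candidate, is_supported_s3_uri(candidate))
```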

Access S3 buckets from a region using V4 signing protocol

This configuration applies to the following conditions:

Steps:

Info

NOTE: These changes should be applied in the XML file local to the server. You do not need to apply these changes in the cluster.

...

On the server, edit the local version of core-site.xml. This file is typically located in the following directory:

Code Block
/etc/hadoop/conf

...

Locate the following configuration. Set the value to the S3 endpoint for the geographic region where the bucket is located:

Code Block
<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.eu-west-99.amazonaws.com</value>
</property>
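
After editing the file, it can help to confirm which endpoint value is actually set. The sketch below reads a named property from Hadoop-style configuration XML using only the Python standard library. The helper `read_hadoop_property` is hypothetical, and the sample text assumes the property sits inside a `<configuration>` root element, as is usual for core-site.xml; the endpoint value is copied from the example above.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Sample configuration text; the endpoint value mirrors the example above.
SAMPLE = """<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>s3.eu-west-99.amazonaws.com</value>
  </property>
</configuration>"""


def read_hadoop_property(xml_text: str, name: str) -> Optional[str]:
    """Return the value of the named property, or None if it is not defined."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    print(read_hadoop_property(SAMPLE, "fs.s3a.endpoint"))
```

To check a real file, read its contents into a string first and pass that string in place of `SAMPLE`.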

...

  • If the machine hosting the platform is in a VPC with no internet access, a VPC endpoint enabled for S3 services is required. The platform does not support access to S3 through a proxy server.
  • Write access requires using S3 as the base storage layer. See Set Base Storage Layer.

    Info

    NOTE: If S3 is set as the base storage layer, you cannot publish to Hive.

Enable read access to S3

When read access is enabled, users can explore S3 buckets for creating datasets. See S3 Browser.

...