IAM roles
Before you begin, define your IAM roles and attach them to the associated EC2 instance.
NOTE: The IAM instance role used for S3 access should have access to resources at the bucket level.
For more information, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html.
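As an illustration of bucket-level access, a minimal IAM policy sketch might look like the following. The bucket name example-bucket and the specific list of actions are assumptions for illustration only. This page does not state which S3 actions the role requires, so adjust the policy to your deployment.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::example-bucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```

In this sketch, the first statement grants access to the bucket itself, and the second covers the objects stored in it.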
AWS System Mode
To enable role-based instance authentication, set the following parameter:
"aws.mode": "system",
Additional AWS Configuration
The following additional parameters must be specified:
Parameter | Description |
---|---|
aws.credentialProvider | Set this value to instance. The IAM instance role is used to provide access. |
aws.hadoopFsUseSharedInstanceProvider | Set this value to |
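Taken together, the system mode and credential provider settings might appear in the configuration as in the fragment below. This is a sketch only: it assumes both parameters sit side by side in the same configuration file, which this page does not confirm. The aws.hadoopFsUseSharedInstanceProvider parameter is omitted, since its value depends on your environment.

```json
"aws.mode": "system",
"aws.credentialProvider": "instance",
```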
Shared instance provider class information
Hortonworks:
"com.amazonaws.auth.InstanceProfileCredentialsProvider",
"org.apache.hadoop.fs.s3a.SharedInstanceProfileCredentialsProvider"
In the future:
CDH is moving back to using the Instance class in a future release. For details, see https://issues.apache.org/jira/browse/HADOOP-14301.
Use of S3 Sources
To access S3 for storage, additional configuration for S3 may be required.
NOTE: Do not configure the properties that apply to user mode.
Output sizing recommendations:
- Single-file output: If you are generating a single file, try to keep its size under 1 GB.
- Multi-part output: For multiple-file outputs, each part file should be under 1 GB in size.
- For more information, see https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-use-multiple-files.html
See Enable S3 Access.