Simple Storage Service (S3) is an online data storage service from Amazon that provides low-latency access through web services. For more information, see https://aws.amazon.com/s3/.
Before you begin, please verify that your Trifacta® environment meets the following requirements:
Integration: Your Trifacta instance is connected to a running environment supported by your product edition.
Multiple region: Multiple S3 connections can be configured in different regions.
The Enable S3 Connectivity setting has been enabled in the Workspace Settings page.
- Acquire the Access Key ID and Secret Key for the S3 bucket or buckets to which you are connecting. For more information on acquiring your key/secret combination, contact your S3 administrator.
Access to S3 requires:
Each user must have appropriate permissions to access S3.
NOTE: If a user does not have write permissions to the specified S3 bucket, publishing jobs to the bucket fail.
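As a rough illustration of the read and write permissions described above, the following sketch builds an AWS IAM policy document for a single bucket. The function name `build_bucket_policy`, the bucket name, and the exact set of actions are assumptions made for this example; the permissions that your deployment requires may differ, so confirm them with your S3 administrator.

```python
import json


def build_bucket_policy(bucket: str, allow_write: bool = True) -> dict:
    """Return an IAM policy document for one bucket (illustrative only)."""
    # Object-level actions: reading is always included, writing is optional.
    object_actions = ["s3:GetObject"]
    if allow_write:
        # Without write access, publishing jobs to the bucket fail.
        object_actions.append("s3:PutObject")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder name.
    print(json.dumps(build_bucket_policy("example-bucket"), indent=2))
```

Passing `allow_write=False` produces a read-only policy, which corresponds to the case in the note above where publishing to the bucket fails.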
- To browse multiple buckets through a single S3 connection, additional permissions are required. See below.
- Authentication using IAM roles is not supported.
- Automatic region detection is not supported when creating or editing a connection.
- Publishing output to multi-part files is not supported.
NOTE: For some file formats, like Parquet, multi-part files are the default output.
- Publishing output using a compression option is not supported for Trifacta Photon jobs.
Workaround: If you need to generate compressed output to this S3 bucket, you can run the job on another running environment.
You can create additional S3 connections by the following method:
Create through application
You can create an S3 connection through the application.
- Log in to the application.
- In the left navigation bar, click the Connections icon.
- In the Create Connection page, click the External Amazon S3 card.
- Specify the connection properties:
DefaultBucket
(Optional) The default S3 bucket to which to connect. When the connection is first accessed for browsing, the contents of this bucket are displayed.
If this value is not provided, then the list of available buckets based on the key/secret combination is displayed when browsing through the connection.
NOTE: To see the list of available buckets, the connecting user must have the getBucketList permission. If that permission is not present and no default bucket is listed, then the user cannot browse S3.
Access Key ID
Access Key ID for the S3 connection.
Secret Key
Secret Key for the S3 connection.
Server Side Encryption
If server-side encryption has been enabled on your bucket, you can select the server-side encryption policy to use when writing to the bucket. SSE-S3 and SSE-KMS methods are supported. For more information, see http://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html.
Server Side Kms key Id
When KMS encryption is enabled, you must specify the AWS KMS key ID to use for the server-side encryption. For more information, see "Server Side KMS Key Identifier" below.
For more information on the other options, see Create Connection Window.
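The property rules above can be sketched as a small validation routine. This is an illustration only, not the application's implementation: the class `S3ConnectionProps`, its field names, and the helper `browse_start` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Encryption methods named in the property descriptions.
SUPPORTED_SSE = {"SSE-S3", "SSE-KMS"}


@dataclass
class S3ConnectionProps:
    """Hypothetical container for the connection properties."""
    access_key_id: str
    secret_key: str
    default_bucket: Optional[str] = None
    server_side_encryption: Optional[str] = None
    server_side_kms_key_id: Optional[str] = None

    def validate(self) -> None:
        # The key/secret combination is always required.
        if not self.access_key_id or not self.secret_key:
            raise ValueError("Access Key ID and Secret Key are required")
        sse = self.server_side_encryption
        if sse is not None and sse not in SUPPORTED_SSE:
            raise ValueError(f"Unsupported server-side encryption: {sse}")
        # A KMS key ID must accompany KMS encryption.
        if sse == "SSE-KMS" and not self.server_side_kms_key_id:
            raise ValueError("SSE-KMS requires a KMS key ID")


def browse_start(props: S3ConnectionProps, can_list_buckets: bool) -> str:
    """Describe what a user sees when first browsing the connection."""
    if props.default_bucket:
        return f"contents of {props.default_bucket}"
    if can_list_buckets:
        return "list of available buckets"
    # No default bucket and no permission to list buckets.
    return "browsing unavailable"
```

The last branch of `browse_start` mirrors the note on DefaultBucket: without a default bucket and without permission to list buckets, the user cannot browse S3.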
Server Side KMS Key Identifier
When KMS encryption is enabled, you must specify the AWS KMS key ID to use for the server-side encryption.
- Access to the key:
  - Access must be provided to the authenticating user.
  - The AWS IAM role must be assigned to this key.
- Encrypt/Decrypt permissions for the specified KMS key ID:
  - Permissions must be assigned to the authenticating user.
  - The AWS IAM role must be given these permissions.
  - For more information, see https://docs.aws.amazon.com/kms/latest/developerguide/key-policy-modifying.html.
The format for referencing this key is the following:
You can use an AWS alias in the following formats. The format of the AWS-managed alias is the following:
The format for a custom alias is the following:
<FSR> is the name of the alias for the entire key.
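For orientation, the sketch below classifies a KMS key identifier using the general AWS conventions: a key ARN of the form `arn:aws:kms:<region>:<account-id>:key/<key-id>`, an AWS-managed alias under `alias/aws/`, and a custom alias of the form `alias/<name>`. These patterns come from AWS's public naming conventions and are an assumption here; the identifiers that this product accepts may be narrower or broader. The function name `classify_kms_identifier` is hypothetical.

```python
import re

# Assumption: default "aws" partition and UUID-style key IDs only.
# Other partitions and multi-Region key IDs are not covered by this sketch.
KEY_ARN = re.compile(r"^arn:aws:kms:[a-z0-9-]+:\d{12}:key/[0-9a-fA-F-]{36}$")
AWS_MANAGED_ALIAS = re.compile(r"^alias/aws/[A-Za-z0-9/_-]+$")
CUSTOM_ALIAS = re.compile(r"^alias/[A-Za-z0-9/_-]+$")


def classify_kms_identifier(value: str) -> str:
    """Return a label describing the shape of a KMS key identifier."""
    if KEY_ARN.match(value):
        return "key-arn"
    # Check the AWS-managed prefix before the general alias pattern,
    # because "alias/aws/..." also matches the custom alias pattern.
    if AWS_MANAGED_ALIAS.match(value):
        return "aws-managed-alias"
    if CUSTOM_ALIAS.match(value):
        return "custom-alias"
    return "unknown"
```

A check of this kind can catch a malformed identifier before the connection is saved, but it does not confirm that the key exists or that the authenticating user has Encrypt/Decrypt permissions on it.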
Create via API
For more information on the vendor and type information to use, see Connection Types.
Java VFS Service
The Java VFS Service has been modified to handle an optional connection ID, enabling S3 URLs with connection ID and credentials. The other connection details are fetched through the Trifacta application to create the required URL and configuration.
You can publish results to your external S3 buckets. Configure an output destination to write to your external S3 bucket.
- In Flow View, create or edit an output object.
- To edit, right-click an output object. The object details are displayed in the Details panel.
- In the Details panel, click Edit.
- Modify the output destination to use the External S3 buckets connection.
- Navigate the bucket to select the appropriate location for the output. Specify the file as needed.
- To save your changes, click Update.
For more information, see Create Outputs.
- Import a dataset from External Amazon S3.
- Add it to a flow and run a job, publishing results back to S3.
For more information, see Verify Operations.