Comment: Published by Scroll Versions from space DEV and version r097

...

When you sign up to use the platform, you are provided a default storage environment, which you can use to get started with the product immediately. A storage environment is used to store your data assets.


The platform supports multiple storage environments, one of which is the default. The default storage environment provides all of the storage capabilities of the other storage environments, as well as storage for data assets generated by use of the product. The following table lists the types of data assets that are stored in each type of storage environment:

| Asset Type | Description | Default Storage | Non-Default Storage |
| --- | --- | --- | --- |
| imported datasets | When you upload data to the product, it is stored in the default storage environment. From these uploaded assets, you create imported datasets, which are sources of data for your transformation recipes in the platform. You can also import data that is stored in other storage environments. | Yes | Yes |
| job results | When you run a job to transform your data, the results of the job execution are stored in a storage environment. | Yes | Yes |
| samples | When you create transformation recipes to transform your data, you are working on a sample of the source data. When you first open a recipe, an initial sample is created for you and stored in the default storage environment. As needed, you can start a job to create a new sample of your data; these ad-hoc samples are also stored in the default storage environment. For more information, see Overview of Sampling. | Yes | No |
| temporary files | During ingestion of data and job execution, the platform requires storage space in the default storage environment to store temporary files. | Yes | No |
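
The table above can be read as a simple lookup. The following is a minimal sketch in Python that encodes only the Yes/No values from the table. The names `ASSET_STORAGE` and `can_store` are hypothetical and are not part of the product.

```python
# Hypothetical lookup that mirrors the table above; not a product API.
ASSET_STORAGE = {
    # asset type: (default storage, non-default storage)
    "imported datasets": (True, True),
    "job results": (True, True),
    "samples": (True, False),
    "temporary files": (True, False),
}


def can_store(asset_type: str, is_default_environment: bool) -> bool:
    """Return True if the asset type can be stored in the given kind of environment."""
    default_ok, non_default_ok = ASSET_STORAGE[asset_type]
    return default_ok if is_default_environment else non_default_ok


if __name__ == "__main__":
    for asset_type in ASSET_STORAGE:
        print(asset_type, can_store(asset_type, True), can_store(asset_type, False))
```

Read this way, samples and temporary files always need the default storage environment, while imported datasets and job results can live in either kind.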

...

Storage Environment Options

...

Tip: You can switch between TFS and S3 as your default storage environment at any time without disruption to service or loss of data.

| Storage Environment | Description |
| --- | --- |
| TFS | Short for Trifacta File System, this S3-backed storage environment is managed by the product vendor and requires no additional configuration. It is available as soon as you launch the product for the first time. When the platform is first launched, TFS is defined as the default storage environment and provides storage for the data asset types listed above. TFS is backed by AWS S3 buckets that are hosted by the product vendor and secured by IAM policies. Using TFS is very similar to navigating S3 buckets to find and select assets to import or to locate job results that you have published. For more information, see Using TFS. |
| S3 | Your S3 buckets and their assets. NOTE: To access your S3 assets, you must provide authentication credentials, policies, and other configuration information to the web application. Additional information is provided below. See Using S3. |

Configure Storage

By default:

...

NOTE: If S3 has been enabled previously, all access to assets stored on S3 is cut off for the web application.

To disable S3 access, please complete the following.

...

NOTE: If TFS has been enabled previously, all access to assets stored on TFS is cut off for the web application.

Steps:

  1. Log in as a workspace administrator.
  2. Open the workspace settings page.
  3. Locate the following setting and set it to Enabled:

    Code Block
    Enable S3 connectivity
  4. Set the following to S3:

    Code Block
    Default storage environment
  5. Locate the following setting and set it to Disabled:

    Code Block
    Trifacta File System
  6. Access to TFS is closed. S3 is used as the default storage environment.
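
As a way to reason about the three settings named in the steps above, the following minimal sketch models them in Python and derives which storage environments remain reachable. The `WorkspaceStorageSettings` class and its field names are hypothetical stand-ins for the settings and are not a product API. The rule that the default environment must itself be enabled is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class WorkspaceStorageSettings:
    """Hypothetical model of the three workspace settings; not a product API."""

    s3_connectivity_enabled: bool      # "Enable S3 connectivity"
    tfs_enabled: bool                  # "Trifacta File System"
    default_storage_environment: str   # "Default storage environment": "S3" or "TFS"

    def accessible_environments(self) -> Set[str]:
        """Return the environments whose assets remain reachable."""
        environments = set()
        if self.s3_connectivity_enabled:
            environments.add("S3")
        if self.tfs_enabled:
            environments.add("TFS")
        return environments

    def validate(self) -> None:
        """Assumed rule: the default environment must be one that is enabled."""
        if self.default_storage_environment not in self.accessible_environments():
            raise ValueError(
                f"default storage environment "
                f"{self.default_storage_environment!r} is not enabled"
            )


if __name__ == "__main__":
    # Values matching steps 3 through 5 above.
    settings = WorkspaceStorageSettings(
        s3_connectivity_enabled=True,
        tfs_enabled=False,
        default_storage_environment="S3",
    )
    settings.validate()
    print(sorted(settings.accessible_environments()))
```

Modeling the settings together makes it easier to spot a combination that leaves the workspace without a usable default storage environment.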

...

  1. Log in as a workspace administrator.
  2. In the web application, select User menu > Admin console > AWS Settings.

  3. For Mode, select your preferred access mode:

    1. All users in the workspace share the same AWS credentials: workspace mode

    2. Each user in the workspace can use their own AWS credentials: user mode

  4. For Authentication Method, select one of the following options:

    NOTE: The authentication method requires information from S3. Specific requirements and configuration are covered in later steps.

    1. Use a cross-account role (IAM role): An IAM role is an AWS object that contains policies defining permissions and access level to AWS and S3 resources. This object must be created by an AWS or S3 administrator. This method of access is recommended.
    2. Use access keys: These key-secret combinations can be used to provide access to S3 buckets. 

...

For more information, see AWS Settings Config Page.
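
The Mode and Authentication Method choices described above combine into four possible configurations. The following Python sketch enumerates them and lists the AWS-level inputs that each authentication method generally needs. The enum and function names are hypothetical, and the labels used for these inputs in the product may differ.

```python
from enum import Enum
from itertools import product
from typing import List, Tuple


class AccessMode(Enum):
    WORKSPACE = "workspace mode"  # all users share the same AWS credentials
    USER = "user mode"            # each user supplies their own AWS credentials


class AuthMethod(Enum):
    IAM_ROLE = "cross-account role (IAM role)"  # recommended method
    ACCESS_KEYS = "access keys"                 # key-secret combination


def required_inputs(method: AuthMethod) -> Tuple[str, ...]:
    """AWS-level inputs per method; the product's own field labels may differ."""
    if method is AuthMethod.IAM_ROLE:
        return ("role ARN",)
    return ("access key ID", "secret access key")


def all_configurations() -> List[Tuple[AccessMode, AuthMethod]]:
    """Enumerate every combination of access mode and authentication method."""
    return list(product(AccessMode, AuthMethod))


if __name__ == "__main__":
    for mode, method in all_configurations():
        print(f"{mode.value} + {method.value}: needs {', '.join(required_inputs(method))}")
```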

Tip: You can also perform this configuration by creating an Amazon S3 connection in the Import Data Page. This connection creates the global connection to S3 for all workspace users. See Amazon S3 Connections.

...

D s ed
editionsawsent

  1. Workspace administrators who have chosen to use IAM roles must ensure that each IAM role includes the proper trust relationship for the web application. For more information, see Insert Trust Relationship in AWS IAM Role.
  2. Workspace users must configure their access.
    1. For more information on configuring access, see Configure Your Access to S3. 
    2. Individual users can also configure directories to use in their S3 bucket for storing assets. See Storage Config Page.
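
For the trust relationship mentioned in step 1, an IAM role's trust policy names the principal that is allowed to assume the role. The sketch below builds such a document in Python using the standard AWS policy grammar. The account ID and external ID are placeholders rather than values from the product, and the use of an external ID condition is an assumption. The actual principal to trust is covered in Insert Trust Relationship in AWS IAM Role.

```python
import json


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Build an IAM trust policy that lets another AWS account assume the role.

    Both arguments are placeholders; take the real values from the product
    documentation. The external ID condition is an assumption.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    policy = build_trust_policy("123456789012", "example-external-id")
    print(json.dumps(policy, indent=2))
```

The printed JSON has the shape that can be pasted into the trust relationship of the role in the AWS console once the placeholders are replaced.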
