AWS private data processing involves running a data processing cluster for Alteryx Analytics Cloud (AAC) inside your AWS account and VPC. This combination of software, your infrastructure, and AWS resources managed by Alteryx is referred to as a private data plane. This page focuses on how to set up your AWS account and VPC so that AAC can create a private data plane there.
Note
The AWS Account and VPC setup requires access and permissions to the AWS Console. If you don’t have this access, contact your IT team to complete this step.
Warning
Never delete resources provisioned for Private Data Processing.
Select the account where you want to run your private data plane.
Because IAM credentials are scoped to the entire account, the most secure way to run a private data plane is in a dedicated AWS account. This is recommended but not required.
Ideally, run the private data plane in the same region as the S3 bucket you selected for private data storage, as well as any data sources you want to connect to AAC. This improves performance and reduces egress costs.
The VPC created in the AWS account should be dedicated to AAC. You can set up connectivity to private data sources using VPC peering, transit gateways,
With your AWS account in place, the next step is to set up the IAM user account and access keys.
Create an IAM user with the name aac_automation_sa. Ensure that this user doesn't have console access.
On Set Permissions, select Next.
Tag the IAM user:
Key Name | Value |
---|---|
AACResource | aac_iam_user |
Select Create User.
Generate an Access Key...
Select the new IAM user and then select the Security credentials tab.
Select Create access key.
Select Other under Access key best practices & alternatives and then select Next.
Select Create access key.
Note
You need the IAM user access key and secret key later when you provision the cloud resources and deploy software.
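The console steps for the IAM user could also be scripted. The following is a minimal sketch, not an Alteryx-provided script: it assumes the boto3 SDK and an IAM client created with `boto3.client("iam")`, and the function name `create_automation_user` is hypothetical. The user name and tag come from this page. The client is passed in as a parameter so the function can be tried against a stub first.

```python
from typing import Any, Dict

# Values taken from the steps on this page.
USER_NAME = "aac_automation_sa"
USER_TAGS = [{"Key": "AACResource", "Value": "aac_iam_user"}]


def create_automation_user(iam: Any) -> Dict[str, str]:
    """Create the tagged IAM user and one access key.

    `iam` is assumed to be a boto3 IAM client. No login profile is
    created, so the user has no console access.
    """
    iam.create_user(UserName=USER_NAME, Tags=USER_TAGS)
    response = iam.create_access_key(UserName=USER_NAME)
    key = response["AccessKey"]
    # The secret access key is only returned at creation time,
    # so store it securely for the provisioning step.
    return {
        "access_key_id": key["AccessKeyId"],
        "secret_access_key": key["SecretAccessKey"],
    }
```

With real credentials, a call would look like `create_automation_user(boto3.client("iam"))`; the returned values are the access key and secret key needed later during provisioning.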
You need to create a custom IAM policy. Name it AAC_Base_SA_Policy and use the following policy document. We recommend using the JSON tab instead of the visual editor. AAC requires some wildcard (*) permissions to run, so expect some security warnings when you create the policy.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor2",
      "Effect": "Allow",
      "Action": [
        "iam:GetOpenIDConnectProvider",
        "iam:GetPolicy",
        "iam:GetPolicyVersion",
        "iam:GetRole",
        "iam:GetRolePolicy",
        "iam:GetUser",
        "iam:GetUserPolicy",
        "iam:ListAttachedRolePolicies",
        "iam:ListAttachedUserPolicies",
        "iam:ListGroupsForUser",
        "iam:ListInstanceProfilesForRole",
        "iam:ListPolicyTags",
        "iam:ListPolicyVersions",
        "iam:ListRolePolicies"
      ],
      "Resource": [
        "arn:aws:iam::*:policy/*",
        "arn:aws:iam::*:oidc-provider/*",
        "arn:aws:iam::*:user/*",
        "arn:aws:iam::*:role/*"
      ]
    },
    {
      "Sid": "VisualEditor3",
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:*",
        "iam:GetAccountName",
        "iam:ListAccountAliases",
        "iam:ListRoles",
        "networkmanager:Describe*",
        "networkmanager:Get*",
        "networkmanager:List*",
        "s3:GetBucketAcl",
        "s3:GetBucketCORS",
        "s3:GetBucketLocation",
        "s3:GetBucketLogging",
        "s3:GetBucketObjectLockConfiguration",
        "s3:GetBucketOwnershipControls",
        "s3:GetBucketPolicy",
        "s3:GetBucketPolicyStatus",
        "s3:GetBucketPublicAccessBlock",
        "s3:GetBucketRequestPayment",
        "s3:GetBucketTagging",
        "s3:GetBucketVersioning",
        "s3:GetBucketWebsite",
        "s3:GetEncryptionConfiguration",
        "s3:GetLifecycleConfiguration",
        "s3:GetObject",
        "s3:GetObjectAcl",
        "s3:GetObjectVersion",
        "s3:GetObjectVersionAcl",
        "s3:GetObjectVersionAttributes",
        "s3:GetObjectVersionForReplication",
        "s3:GetObjectVersionTagging",
        "s3:GetObjectVersionTorrent",
        "s3:GetReplicationConfiguration",
        "s3:ListAllMyBuckets",
        "s3:ListBucket",
        "s3:ListBucketVersions",
        "sts:GetCallerIdentity"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor4",
      "Effect": "Allow",
      "Action": "secretsmanager:*",
      "Resource": "arn:aws:secretsmanager:*:*:secret:*"
    }
  ]
}
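Because the policy includes wildcard permissions, it can help to see which statements are likely to trigger the security warnings. The sketch below is illustrative only: it works on an abbreviated copy of the policy above (one or two actions per statement), and `find_wildcards` is a hypothetical helper that flags only full-service wildcards such as `secretsmanager:*` and a `Resource` of `*`. It does not flag prefix patterns such as `networkmanager:Describe*`, and it is not how the AWS Console produces its warnings.

```python
import json
from typing import List, Union

# Abbreviated copy of the policy above, shortened for illustration.
POLICY_JSON = """
{
  "Version": "2012-10-17",
  "Statement": [
    {"Sid": "VisualEditor2", "Effect": "Allow",
     "Action": ["iam:GetRole"],
     "Resource": ["arn:aws:iam::*:role/*"]},
    {"Sid": "VisualEditor3", "Effect": "Allow",
     "Action": ["elasticloadbalancing:*", "s3:GetObject"],
     "Resource": "*"},
    {"Sid": "VisualEditor4", "Effect": "Allow",
     "Action": "secretsmanager:*",
     "Resource": "arn:aws:secretsmanager:*:*:secret:*"}
  ]
}
"""


def as_list(value: Union[str, List[str]]) -> List[str]:
    """IAM allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def find_wildcards(policy: dict) -> List[str]:
    """Return one finding per full-service wildcard action or '*' resource."""
    findings = []
    for stmt in policy["Statement"]:
        sid = stmt.get("Sid", "<no Sid>")
        for action in as_list(stmt["Action"]):
            if action == "*" or action.endswith(":*"):
                findings.append(f"{sid}: action {action}")
        if "*" in as_list(stmt["Resource"]):
            findings.append(f"{sid}: resource *")
    return findings


findings = find_wildcards(json.loads(POLICY_JSON))
for line in findings:
    print(line)
```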
Tag the custom IAM policy created in Step 2b.
Key Name | Value |
---|---|
AACResource | aac_sa_custom_policy |
Attach the AAC_Base_SA_Policy IAM policy to the aac_automation_sa service account created in Step 2a.
Note
AAC_Base_SA_Policy is an example policy name. You can choose any name for the policy, but the name must start with AAC_Base.
Create the VPC after you create the IAM policy...
Create a new VPC in one of the supported regions. For information on supported regions, go to Private Data Processing.
Select VPC and more.
Configure CIDR blocks in the VPC. You might need to create the VPC with a single CIDR and then select Edit CIDRs to add the second.
For Designer Cloud and Machine Learning, add /18 and /21 CIDRs.
For Cloud Execution for Desktop, add a /21 CIDR.
Select 3 in the Number of Availability Zones (AZs) section.
Select 0 in the Number of public subnets section.
Select 0 in the Number of private subnets section.
Select None in the NAT gateways section.
Enable the S3 Gateway VPC endpoint within the VPC.
Enable DNS hostnames and resolution.
Tag the VPC.
Tag Name | Value |
---|---|
AACResource | aac_vpc |
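To get a feel for the CIDR sizes named in the VPC steps above, the following sketch uses Python's standard `ipaddress` module to show how many addresses a /18 and a /21 block contain and to confirm that two candidate blocks do not overlap. The specific ranges (`10.0.0.0/18` and `10.0.64.0/21`) are hypothetical examples, not values required by AAC; substitute the ranges your network team allocates.

```python
import ipaddress

# Hypothetical example ranges; replace with your own allocations.
primary = ipaddress.ip_network("10.0.0.0/18")
secondary = ipaddress.ip_network("10.0.64.0/21")


def describe(net: ipaddress.IPv4Network) -> str:
    """Summarize a CIDR block as 'network -> N addresses'."""
    return f"{net} -> {net.num_addresses} addresses"


print(describe(primary))    # 10.0.0.0/18 -> 16384 addresses
print(describe(secondary))  # 10.0.64.0/21 -> 2048 addresses

# CIDR blocks associated with the same VPC must not overlap.
print("overlap:", primary.overlaps(secondary))  # overlap: False
```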
Note
Connections to private data sources require network paths between the VPC and the data source. As defined in the shared responsibility matrix, you set up these network paths in accordance with your own network policies and preferences.
If your network setup requires usage of a transit gateway or internet gateway, set up and tag them now.
Tag Name | Value |
---|---|
AACResource | aac |
Your data source credentials are encrypted using your key and securely stored in a private vault (Private Credential Storage) within your private data plane account. These credentials are only retrieved from the vault when you need them.
Go to Key Management Services and select Create Key.
Select Key Type > Symmetric.
Select Key Usage > Encrypt and Decrypt.
Name the key aac-credentials-vault-key.
Tag the key.
Tag Name | Value |
---|---|
AACResource | aac |
For Key Administrator select Next.
For Key Usage Permissions select Next.
Select Finish.
Note
The key ARN is provided on the Private Data Processing page.
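The key creation steps above could be scripted along the following lines. This is a sketch under stated assumptions rather than an official procedure: it assumes the boto3 SDK and a KMS client created with `boto3.client("kms")`, it treats the key name from the console step as a KMS alias (an assumption, since KMS keys are named through aliases), and `create_vault_key` is a hypothetical function name. It leaves the key policy at the AWS default, mirroring the console steps where the administrator and usage permission pages are skipped.

```python
from typing import Any

# The alias mirrors the key name used on this page; KMS aliases
# must start with the "alias/" prefix.
KEY_ALIAS = "alias/aac-credentials-vault-key"
KEY_TAGS = [{"TagKey": "AACResource", "TagValue": "aac"}]


def create_vault_key(kms: Any) -> str:
    """Create a tagged symmetric encrypt/decrypt key and return its ARN.

    `kms` is assumed to be a boto3 KMS client.
    """
    response = kms.create_key(
        KeySpec="SYMMETRIC_DEFAULT",   # symmetric key type
        KeyUsage="ENCRYPT_DECRYPT",    # encrypt and decrypt usage
        Tags=KEY_TAGS,
    )
    metadata = response["KeyMetadata"]
    kms.create_alias(AliasName=KEY_ALIAS, TargetKeyId=metadata["KeyId"])
    return metadata["Arn"]
```

The returned ARN is the value to keep at hand for the Private Data Processing page.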
By default, each AWS account has a maximum limit on the number of vCPUs running simultaneously. To run your private environment, you need to request an increase in the quota. This allows your environment to scale up to meet your demand.
The exact number depends on several factors, such as how many applications you run, how long you are willing to wait for jobs when there isn't enough hardware, and how much you are willing to pay for additional infrastructure to shorten those waits.
Below, you can see recommended numbers for running analytics workloads on Alteryx Analytics Cloud. You may choose a different number. For example, you may not want to scale that high to limit the cost of the environment. Or you may want to increase it because you are running multiple Analytics Cloud applications and want to ensure that hardware is always available.
Alteryx recommends requesting a quota increase for the following quotas:
This quota applies to the Kubernetes cluster that runs Designer Cloud, Machine Learning, Auto Insights, and App Builder. It also applies to the autoscaling group of VMs that power Cloud Execution for Desktop.
Our recommendation for this quota is based on allowing your cluster and autoscale groups to consume up to 50 nodes.
Quota Name: Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances.
Quota Description: Maximum number of vCPUs assigned to the Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances.
AWS default quota value: 5
Recommended minimum quota value: 800 (50 nodes x 16 vCPU/node).
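The recommended value follows from a simple multiplication, which makes it easy to adapt if you cap the cluster at a different size. A small worked example, where `required_vcpu_quota` is a hypothetical helper and the 20-node figure is only an illustration of a smaller, cost-limited environment:

```python
def required_vcpu_quota(max_nodes: int, vcpus_per_node: int) -> int:
    """vCPU quota needed to let the environment scale to max_nodes."""
    return max_nodes * vcpus_per_node


# Recommendation from this page: 50 nodes at 16 vCPU per node.
recommended = required_vcpu_quota(50, 16)
print(recommended)  # 800

# Hypothetical smaller environment capped at 20 nodes.
smaller = required_vcpu_quota(20, 16)
print(smaller)  # 320
```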
This quota applies only if you plan to deploy EMR as a Designer Cloud processing option.
Quota Name: Max concurrent vCPUs per account.
Quota Description: Maximum number of vCPUs that can be concurrently run in this account in the current Region.
AWS default quota value: 16
Applied quota value: 1024
Sign in to the AWS account console.
Search for Service Quotas and select the service.
Select AWS services from the left navigation pane.
Search for the service (for example, Amazon EMR Serverless or Amazon EC2).
Select the quota name.
Select Request quota increase.
Request the specified quota increase.
Warning
Changing or removing any AAC-provisioned public cloud resources after Private Data Handling has been set up can cause inconsistencies. These inconsistencies may lead to errors during job execution or when deprovisioning the Private Data Handling setup.
Provisioning of the data processing environment is triggered from the Admin Console in AAC. You need Workspace Admin privileges within a workspace to see it.
On the AAC landing page, select the circle icon in the top right with your initials in it. Select Admin Console from the menu.
Select Private Data Handling from the left navigation menu.
Make sure that Private Data Storage shows Successfully Configured before you proceed. If the status is Not Configured, go to AWS S3 as Private Data Storage first, then return to this step.
Under the Private Data Processing section, you need to fill out 5 fields. These values come from the AWS account and VPC setup steps you just completed.
Note
The key ARN is provided on the Private Data Processing page in Step 5: Create a Symmetric Key for Secure Vault.
Selecting Create triggers a set of validation checks to verify that the AWS account is set up as needed. If permissions are not configured correctly, or the VPC resources are not created or tagged correctly, you'll receive an error message with a description that should point you in the right direction.
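Since incorrect tags are one cause of validation errors, a quick local check before selecting Create may save a round trip. The sketch below compares a hand-collected inventory against the AACResource tag values listed on this page. The `check_tags` helper, the resource labels, and the sample inventory are all hypothetical, and the checks AAC actually runs are not documented here, so passing this check does not guarantee that validation succeeds.

```python
from typing import Dict, List

# Expected AACResource tag values, taken from the tagging steps on this page.
EXPECTED_TAGS = {
    "iam_user": "aac_iam_user",
    "iam_policy": "aac_sa_custom_policy",
    "vpc": "aac_vpc",
    "kms_key": "aac",
}


def check_tags(actual: Dict[str, Dict[str, str]]) -> List[str]:
    """Return a list of tag problems; an empty list means all tags match."""
    problems = []
    for resource, expected in EXPECTED_TAGS.items():
        value = actual.get(resource, {}).get("AACResource")
        if value is None:
            problems.append(f"{resource}: missing AACResource tag")
        elif value != expected:
            problems.append(f"{resource}: expected {expected!r}, found {value!r}")
    return problems


# Hypothetical inventory, for example copied by hand from the AWS Console.
inventory = {
    "iam_user": {"AACResource": "aac_iam_user"},
    "iam_policy": {"AACResource": "aac_sa_custom_policy"},
    "vpc": {"AACResource": "aac-vpc"},  # hyphen instead of underscore
    "kms_key": {},                      # tag was never added
}

problems = check_tags(inventory)
for problem in problems:
    print(problem)
```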
Cloud Execution for Desktop in AWS