Private Data Handling
Private data handling is a capability in Alteryx Analytics Cloud that allows you to store your data and run data processing jobs in your own cloud infrastructure. Private data handling provides more security and control for those with sensitive data. It also results in improved performance and reduced egress costs by moving the processing in Alteryx Analytics Cloud next to your data.
Overview
At the highest level, Alteryx Analytics Cloud differentiates between customer data and application metadata.
Customer data belongs to you. This is the data that you want to join and merge, prep and blend, analyze, train models on, and perform other kinds of analytic processing.
Application metadata is the data that Alteryx Analytics Cloud needs to do the jobs you ask it to do.
Alteryx Analytics Cloud uses a split-plane architecture that divides responsibility for these 2 kinds of data between 2 planes, the control plane and the data plane. This split gives customers more flexibility over where their data lives.
Plane | Description |
---|---|
Control Plane | The control plane powers the user's design time experience, acts as the command and control center, and stores application metadata. |
Data Plane | The data plane is responsible for the processing and storage of customer data. Note: Third-party datastores and execution engines such as data warehouses are also part of the data plane. |
Private data handling allows you to run a data plane inside your own AWS VPC, giving you control over where you store and process your data. It consists of 2 components:
Private data storage: Use Alteryx Analytics Cloud to leverage your existing AWS infrastructure for the at-rest storage of platform metadata and other assets.
Private data processing: Use Alteryx Analytics Cloud to run your own data processing resources for the execution of data processing activities including connecting to data sources, processing data, converting data from one format to another, and publishing job outputs. This arrangement ensures that no data leaves your VPC.
Private data handling includes defense-in-depth security controls to protect your data assets and meet compliance requirements. If desired, you can further increase security by adding firewall or IP restrictions that allow ingress only from the Alteryx Analytics Cloud control plane.
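As a rough illustration of the IP restriction described above, the following Python sketch checks whether a source address falls inside an allowlist of CIDR ranges. The ranges are placeholder documentation addresses, not the actual Alteryx Analytics Cloud control plane addresses, and the function name is hypothetical. In practice you would enforce this rule in a firewall or security group rather than in application code.

```python
import ipaddress

# Hypothetical allowlist. These are reserved documentation ranges (RFC 5737),
# NOT the real control plane addresses.
ALLOWED_INGRESS_CIDRS = ["203.0.113.0/24", "198.51.100.16/28"]


def is_allowed_source(source_ip, allowed_cidrs=ALLOWED_INGRESS_CIDRS):
    """Return True if source_ip belongs to any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


if __name__ == "__main__":
    for candidate in ("203.0.113.7", "198.51.100.40"):
        print(candidate, is_allowed_source(candidate))
```

The same allow-only-listed-sources logic is what a security group ingress rule or network firewall expresses declaratively.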
Feature availability:
Feature | Availability |
---|---|
Private Data Storage | |
Private Data Processing | |
Architecture
When you choose to use private data handling, the Alteryx Analytics Cloud control plane initiates interactions with your S3 bucket (private data storage), a Kubernetes cluster managed by Alteryx, and Amazon EMR Serverless, all of which run inside your VPC. Some of these interactions are for command and control. Others are data paths that move your data from one place to another as directed.
Design-time
A user interacts with Alteryx Analytics Cloud through a web application in their web browser. Most design-time activities including adding tools to the canvas and live processing of results take place entirely within the browser. These are notable exceptions:
Upload a file for input datasets.
Retrieve a sample of data.
Save a workflow.
Alteryx Analytics Cloud never stores or caches customer data within the control plane. Note that some light processing (delimiter and header detection, column name and type inference, and the transform by example capability) might occur within the control plane.
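To make the idea of delimiter and header detection concrete, here is a minimal sketch that uses the `csv.Sniffer` class from the Python standard library. It is not how Alteryx Analytics Cloud implements this step. The sample data and the `detect_layout` function name are assumptions chosen for illustration.

```python
import csv

# Hypothetical sample: the first few lines of an uploaded file.
SAMPLE = "id;name;score\n1;Ada;9.5\n2;Bob;7.0\n3;Cyd;8.25\n"


def detect_layout(sample, candidates=",;\t|"):
    """Guess the delimiter and whether the first row is a header.

    Only a small sample of the file is inspected, which is why this kind
    of check counts as light processing.
    """
    sniffer = csv.Sniffer()
    dialect = sniffer.sniff(sample, delimiters=candidates)
    return dialect.delimiter, sniffer.has_header(sample)


if __name__ == "__main__":
    delimiter, has_header = detect_layout(SAMPLE)
    print("delimiter:", repr(delimiter), "header row:", has_header)
```

Both checks are heuristics that look at character frequencies and at how the first row differs from the rows that follow, so they can be wrong on very small or irregular samples.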
Data Security
Alteryx offers a downloadable whitepaper that covers private data handling privacy and security in depth. You can find a link to this document at alteryx.com/trust in the Private Data Handling section.
For convenience, these are a few highlights for encryption of data in transit and at rest:
Data in transit is encrypted with TLS 1.3 between the browser and the control plane, and between the control plane and the data plane.
Alteryx uses mTLS encryption for intra-cluster communications.
File storage and database credentials are stored in a control plane database that is encrypted with 256-bit AES block ciphers.
Alteryx applies envelope encryption to these credentials before they pass from the control plane to the data plane. In the data plane, these credentials become available to job pods as Kubernetes secrets.
AWS Secrets Manager stores the private key used to decrypt the encrypted credentials. The AWS Secrets Manager resides in the data plane and is mounted into the EKS cluster using the external secrets operator.
EKS workloads access secrets in AWS Secrets Manager through a Kubernetes ServiceAccount annotated with an IAM Role that has the GetSecretValue permission.
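The ServiceAccount and IAM Role arrangement described above can be sketched as two small documents built in Python. This is an illustration only: the account ID, role name, secret name, namespace, and function names are all hypothetical, and the `eks.amazonaws.com/role-arn` annotation key is the standard EKS mechanism for IAM roles for service accounts, which is assumed here rather than confirmed by this page.

```python
import json

# Hypothetical identifiers for illustration only.
ROLE_ARN = "arn:aws:iam::111122223333:role/example-job-secrets-reader"
SECRET_ARN = (
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:example-credentials-key-*"
)


def service_account_manifest(name, namespace, role_arn):
    """Build a Kubernetes ServiceAccount manifest annotated with an IAM Role."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Assumed: the standard EKS annotation that links a role to the account.
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


def secrets_read_policy(secret_arn):
    """Build an IAM policy that only allows reading the given secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            }
        ],
    }


if __name__ == "__main__":
    manifest = service_account_manifest("example-job-runner", "example-jobs", ROLE_ARN)
    print(json.dumps(manifest, indent=2))
    print(json.dumps(secrets_read_policy(SECRET_ARN), indent=2))
```

Scoping the policy to a single action and a specific secret keeps the permission granted to job pods as narrow as possible.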
Upgrades
Not having to worry about upgrades is one of the benefits of software-as-a-service. Alteryx Analytics Cloud manages upgrades for you.
Alteryx Analytics Cloud manages software upgrades for long-running services. When new versions of the software are available, Alteryx pushes new container images to our image repositories. Alteryx Analytics Cloud retrieves these new image versions and seamlessly begins using them within the cluster without disrupting any running jobs.
Alteryx also manages infrastructure upgrades on your behalf.
Metrics Collection
Alteryx Analytics Cloud uses Datadog to collect application monitoring and usage data to monitor and maintain operational stability. The Datadog agent deploys on every node in the EKS cluster. To bundle the Datadog agent in the AMI used to deploy EC2 instances, select compatibility-mode execution. The Datadog agent collects these metrics:
Telemetry metrics from the EKS cluster, S3, EMR Serverless, and EC2.
Kubernetes monitoring.
Custom logs from the services in the EKS cluster.
CloudWatch logs for the public cloud-managed services (for example, EMR and S3) used by products in Alteryx Analytics Cloud.
How to Enable Private Data Handling
Private data handling consists of 2 capabilities: private data storage and private data processing. Configure private data storage first, then private data processing. Because private data processing requires running a data processing cluster inside your VPC, it requires additional VPC setup.
Private Data Storage
In this step, configure workspace storage in your own VPC, set that as the default storage environment, and disable Alteryx Data Storage (ADS).
Anything saved in ADS becomes inaccessible after you disable it. If possible, complete this step before users upload any assets.
For more details, go to Private Data Storage.
Private Data Processing
In this step, select a region and an account. Then create a VPC, subnets, route tables, and the IAM permissions that allow Alteryx Analytics Cloud to create and manage the data processing infrastructure and software.
Alteryx recommends using a dedicated account and VPC for the best security and stability, although other configurations are possible. Alteryx also provides some CloudFormation templates that you can use to assist in this step.
For more details, go to Setup AWS Account and VPC for Private Data.
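When planning the VPC and subnets for this step, it can help to work out the address ranges before creating anything. The sketch below uses the Python `ipaddress` module to carve equal-sized subnets out of a VPC CIDR block. The CIDR block, prefix length, subnet count, and `plan_subnets` function name are assumptions for illustration. They are not sizes required by Alteryx Analytics Cloud, so check the linked setup page for the actual requirements.

```python
import ipaddress
from itertools import islice


def plan_subnets(vpc_cidr, new_prefix, count):
    """Return the first `count` subnets of size /new_prefix inside vpc_cidr."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = list(islice(vpc.subnets(new_prefix=new_prefix), count))
    if len(subnets) < count:
        raise ValueError("VPC CIDR block is too small for the requested subnets")
    return [str(subnet) for subnet in subnets]


if __name__ == "__main__":
    # Hypothetical example: three /20 subnets inside a /16 VPC.
    for cidr in plan_subnets("10.0.0.0/16", 20, 3):
        print(cidr)
```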
In the next step, grant Alteryx Analytics Cloud permission to spin up the cluster, kick off the provisioning process, and update a trust relationship between your new processing cluster and your private data storage bucket.
For more details, go to Private Data Processing.
Known Limitations
There are some known limitations to private data processing:
Designer Cloud, Machine Learning, and Reporting are compatible with private data processing. Other early access applications such as App Builder are not yet compatible and show as disabled in a workspace where you've enabled private data processing.
Using SSH Tunneling with connectors is not yet supported in a workspace with private data processing.
S3 User Mode is not yet supported when you've enabled private data processing.