This feature is no longer supported. Please do not install the platform using this method.
In the Amazon AWS infrastructure, the platform can be deployed in a high availability failover mode across multiple nodes. This section describes the process for installing the platform across multiple, highly available nodes.
NOTE: This section applies to customer-managed deployments of the platform.
The following limitations apply to this feature:
During installation, the platform is configured to use the same account to access AWS resources. Per-user authentication must be set up afterward.
Before you begin, please verify that you have met the following requirements.
A set of permissions must be enabled for the accounts or IAM roles used to access the bucket. For more information, see Enable S3 Access in the Configuration Guide.
To ensure sufficient database connections, the instance size must be larger than m4.large.
NOTE: You should avoid using the default namespace, which may be shared by other applications using your cluster.
Instance types:
Tip: Instance sizes should be larger than m4.large.
 | Minimum | Recommended |
---|---|---|
Cores | 8 | 16 |
RAM | 12 GB | 16 GB |
Disk space | 10 GB | 10 GB |
NOTE: If you are publishing to S3, additional disk space should be reserved for a higher number of concurrent users or larger data volumes. For more information on fast upload to decrease disk requirements, see Enable S3 Access in the Configuration Guide.
For more information on installing and managing an EKS cluster, see https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html.
The following command line interfaces are referenced as part of this install process:
The following assets are available from the FTP site:
Please complete the following steps to download and configure the Docker image for use.
Steps:
Download the image file from the FTP site. The image filename should be in the following format:

trifacta-docker-image-ha.x.y.z.tar

where x.y.z maps to the release number (for example, 7.6.0 for Release 7.6.0).
Load the image file into your ECR repository. For more information, see https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-push-ecr-image.html.
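The load-and-push flow can be scripted. Below is a minimal sketch; the account ID, region, repository name, and the local tag of the loaded image are all assumptions and should be replaced with your own values:

```shell
# All values below are assumptions -- substitute your own account ID,
# region, repository name, and release version.
AWS_ACCOUNT_ID=123456789012
AWS_REGION=us-east-1
VERSION=7.6.0
REGISTRY="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"
REPO="${REGISTRY}/trifacta"

# Authenticate Docker to the ECR registry.
aws ecr get-login-password --region "${AWS_REGION}" \
  | docker login --username AWS --password-stdin "${REGISTRY}"

# Load the downloaded image archive, then tag and push it to ECR.
# The local tag of the loaded image is an assumption; check the
# output of `docker load` for the actual tag.
docker load --input "trifacta-docker-image-ha.${VERSION}.tar"
docker tag "trifacta:${VERSION}" "${REPO}:${VERSION}"
docker push "${REPO}:${VERSION}"
```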
The image file has been loaded into the repository.
Prerequisites:
You must have installed kubectl to configure your Kubernetes deployment on AWS.

Steps:
Update the Kubernetes configuration (update-kubeconfig):

aws eks update-kubeconfig --name <eks-cluster-name> --region <aws-region>

where:
<eks-cluster-name> is the name of the EKS cluster to use.
<aws-region> is the name of the region where the cluster is located.
Tip: Retain the EKS cluster name and region. These values may be used later during configuration.
Switch to the namespace in the above cluster:
NOTE: You should avoid using the default namespace, which may be shared by other applications using your cluster.
kubectl config set-context --current --namespace=<namespace>
Verify that you are ready to use the namespace in the cluster:
kubectl get pods
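If a dedicated namespace does not exist yet, you can create it before switching contexts. A minimal sketch; the namespace name trifacta is an assumption:

```shell
# Hypothetical namespace name -- substitute your own dedicated namespace.
NAMESPACE=trifacta

# Create the namespace if it does not already exist, then make it the
# default for the current kubectl context.
kubectl get namespace "${NAMESPACE}" >/dev/null 2>&1 \
  || kubectl create namespace "${NAMESPACE}"
kubectl config set-context --current --namespace="${NAMESPACE}"
```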
For each of the databases that you have installed, you must set up database credential secrets. Please use the following pattern for configuring your database secrets.
NOTE: Except for |
kubectl create secret generic db-credentials-webapp --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-scheduling-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-time-based-trigger-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-artifact-storage-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-authorization-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-configuration-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-job-metadata-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-secure-token-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-job-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-contract-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-orchestration-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-optimizer-service --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-batch-job-runner --from-literal=username=<db_username> --from-literal=password=<db_password>
kubectl create secret generic db-credentials-admin --from-literal=username=<db_username> --from-literal=password=<db_password>
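Because every secret follows the same pattern, the commands above can also be generated in a loop. A minimal sketch, using the same placeholder credentials:

```shell
# Placeholder credentials -- substitute the real database username/password.
DB_USERNAME="<db_username>"
DB_PASSWORD="<db_password>"

# The service suffixes used by the db-credentials-* secrets above.
SERVICES="webapp scheduling-service time-based-trigger-service \
artifact-storage-service authorization-service configuration-service \
job-metadata-service secure-token-service job-service contract-service \
orchestration-service optimizer-service batch-job-runner admin"

# Create one secret per service.
for svc in $SERVICES; do
  kubectl create secret generic "db-credentials-${svc}" \
    --from-literal=username="${DB_USERNAME}" \
    --from-literal=password="${DB_PASSWORD}"
done
```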
where:
<db_username> is the username used to access the specified database.
<db_password> is the password corresponding to the specified database.

Steps:
Unpack the tar file obtained from the FTP site:

tar -xf trifacta-ha-setup-bundle-x.y.z.tar

where x.y.z maps to the release number (for example, 7.6.0 for Release 7.6.0).
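For a concrete release, the extraction can be sketched as follows; the VERSION value is an example and should match the bundle you downloaded:

```shell
# VERSION is an example -- substitute the release you downloaded.
VERSION=7.6.0
BUNDLE="trifacta-ha-setup-bundle-${VERSION}.tar"

# List the archive contents first, then extract into the current directory.
tar -tf "${BUNDLE}"
tar -xf "${BUNDLE}"
```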
Create a copy of the value overrides template file (values.override.template.yaml):
cp values.override.template.yaml values.override.yaml
Edit the values.override.yaml file. Instructions are below.

Example file:
# Template for minimal configuration
# to get a High-availability deployment of Trifacta up and running
replicaCount: 2
image:
  repository: "<PATH TO IMAGE_REPO>"
loadBalancer:
  ssl:
    # ARN to certificate in ACM
    certificateARN: arn:aws:acm:XXXX:certificate/XXXXXXX
nfs:
  conf:
    server: "<NFS SERVER HOST>"
    path: "/"
  logs:
    server: "<NFS SERVER HOST>"
    path: "/"
database:
  host: "<DATABASE HOST>"
  port: "5432"
  type: postgresql
triconfOverrides:
  "aws.accountId": "<AWS ACCT_ID>"
  "aws.credentialProvider": "<AWS CRED PROVIDER>"
  "aws.systemIAMRole": "arn:aws:iam::XXXX:role/XXXXXX"
  "aws.s3.bucket.name": "<AWS S3 BUCKET NAME>"
  "aws.s3.key": "<AWS S3 KEY>"
  "aws.s3.secret": "<AWS S3 SECRET>"
# Enable a fluentd StatefulSet to collect application logs.
fluentd:
  enabled: true
  # Specify values overrides for fluentd chart here
# Enable a fluentd DaemonSet to collect node, K8s dataplane and cluster logs
fluentd-daemonset:
  enabled: false
  # Specify values overrides for the fluentd-daemonset chart here
# Cluster details must be specified if fluentd logging is enabled
global:
  cluster:
    name: "<CLUSTER NAME>"      # EKS Cluster name
    region: "<CLUSTER REGION>"  # EKS Cluster region
Tip: Paths to values are listed below in dot notation (for example, loadBalancer.ssl.certificateARN).
Value | Description |
---|---|
replicaCount | Number of replica nodes of the platform. |
image.repository | AWS path to the ECR image repository that you created. |
By default, SSL is enabled, and a certificate is required.
SSL certificate requirements: the default SSL settings are defined in the values.yaml file provided in the setup bundle. To configure the certificate, override the following value:

Value | Description |
---|---|
loadBalancer.ssl.certificateARN | The ARN for the SSL certificate in the AWS Certificate Manager. |
The certificate ARN value references the ARN stored in the AWS Certificate Manager, or you can import your own certificate into ACM. For more information, see https://docs.aws.amazon.com/acm/latest/userguide/import-certificate.html.
To disable SSL, apply the following configuration changes:

loadBalancer:
  ssl:
    enabled: false
The following values are used to define the locations of the mount points for storing configuration and log data.
NOTE: You should have reserved at least 10 GB for each mount point.
Value | Description |
---|---|
nfs.conf.server | Host of the NFS server for the configuration mount point. |
nfs.conf.path | On the conf server, the path to the storage area. Default is the root location. |
nfs.logs.server | Host of the NFS server for the logging mount point. |
nfs.logs.path | On the logs server, the path to the storage area. Default is the root location. |
Value | Description |
---|---|
database.host | Host of the Amazon RDS databases. |
database.port | Port number through which to access the RDS databases. The default value is 5432. |
database.type | The type of database. Please leave this value as postgresql. |
Below, you can specify values that are applied to the platform configuration file. For more information on these settings, see Configure for AWS in the Configuration Guide.
Value | Description |
---|---|
triconfOverrides.aws.accountId | The AWS account identifier to use when connecting to AWS resources. |
triconfOverrides.aws.credentialProvider | The type of credential provider to use for individuals authenticating to AWS resources. Supported values include temporary and default, described below. |
triconfOverrides.aws.systemIAMRole | When the credential provider is set to temporary, this value defines the system-wide IAM role to use to access AWS. |
triconfOverrides.aws.s3.key | When the credential provider is set to default, this value defines the AWS key to use for authentication. |
triconfOverrides.aws.s3.secret | When the credential provider is set to default, this value defines the AWS secret for the AWS key. |
triconfOverrides.aws.s3.bucket.name | The default S3 bucket to use. |
After the platform is operational, you can apply additional configuration changes to this file through the command line or through the application. For more information, see Platform Configuration Methods in the Configuration Guide.
When enabled, a separate set of fluentd pods is launched to collect and forward application logs.
Value | Description |
---|---|
fluentd.enabled | When set to true, a fluentd StatefulSet is launched to collect application logs. You can specify value overrides for the fluentd chart under this key. |
fluentd-daemonset.enabled | When set to true, a fluentd DaemonSet is launched to collect node, Kubernetes dataplane, and cluster logs. |
If either of the above fluentd logging options is enabled, the following must be specified:
Value | Description |
---|---|
global.cluster.name | This value is the name of the EKS cluster that you created. |
global.cluster.region | This value is the name of the region where the EKS cluster was created. |
Optionally, you can enable fluentd to collect application logs.
Log destinations:
The log source for fluentd is the platform's application logs.
The log destination must be configured. For more information on the fluentd output plugins, see https://www.fluentd.org/dataoutputs.
Create a logdestination.conf configuration file for your log destination, then create a ConfigMap from it:

kubectl create configmap fluentd-log-destination --from-file logdestination.conf
The logdestination.conf file must be in fluentd configuration format. Below is an example logdestination.conf file, which pushes logs to AWS CloudWatch:
<label @NORMAL>
  <match app.*>
    @type cloudwatch_logs
    @id out_cloudwatch_logs_application
    region "#{ENV.fetch('REGION')}"
    log_group_name "/aws/containerinsights/#{ENV.fetch('CLUSTER_NAME')}/application"
    log_stream_name_key stream_name
    auto_create_stream true
    json_handler yajl
    <buffer>
      flush_interval 5
      chunk_limit_size 2m
      queued_chunks_limit_size 32
      retry_forever true
    </buffer>
  </match>
</label>
For more information on fluentd configuration file syntax, see https://docs.fluentd.org/configuration/config-file.
The logdestination.conf file is added as an add-on to the prepackaged fluentd configuration for the platform.

After you have configured the values override file, you can use the following command to install the deployment using Helm:
helm install trifacta <trifacta-helm-package-tgz-file> --namespace <namespace> --values <path-to-values-override-file>
where:
<trifacta-helm-package-tgz-file> is the name of the Helm package that you downloaded from the FTP site.
<namespace> is the AWS Kubernetes namespace value.
<path-to-values-override-file> is the path in your local environment to the values override file.

Use the following command to retrieve the service URL.
NOTE: The service URL is used to access the application.
kubectl get svc trifacta -o json | jq -r '.status.loadBalancer.ingress[0].hostname'
Copy and paste the service URL into a supported web browser. For more information on supported web browsers, see Desktop Requirements in the Planning Guide.
Tip: You can map a CNAME/ALIAS record against this service URL through Route53 configurations. For more information, see https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/getting-started.html.
The login screen for the platform should be displayed. Log in to the application using the admin credentials.
You should change the administrator password as soon as you log in. For more information, see Change Admin Password in the Admin Guide.
For more information, see Login.
Scale the number of pods through kubectl:
kubectl scale statefulset trifacta --replicas=<desired number of pods>
Restart the pods through kubectl:

kubectl rollout restart statefulset trifacta
Use the following command to delete the deployment:

kubectl delete statefulset trifacta
By default, Amazon RDS performs periodic backups of your installed databases.
For more information on manual backup of the databases, see Backup and Recovery in the Admin Guide.
For more information on backing up your EFS mounts through AWS, see https://docs.aws.amazon.com/efs/latest/ug/awsbackup.html.
You must configure access to S3.
NOTE: If you are publishing to S3, 50 GB or more is recommended for storage per node. Additional disk space should be reserved for a higher number of concurrent users or larger data volumes. You can also enable fast upload to decrease disk requirements.
For more information, see Enable S3 Access in the Configuration Guide.
S3 must be set as the base storage layer. For more information, see Set Base Storage Layer in the Configuration Guide.
When the platform is first installed, a temporary license is provided. This license key must be replaced by the license key that was provided to you. For more information, see License Key in the Admin Guide.
Additional configuration is required to enable the platform to run jobs on the EMR cluster. For more information, see Configure for EMR in the Configuration Guide.