Before you deploy the platform, complete the following configuration steps in your Hadoop environment.
The platform interacts with Hadoop through a single system user account. A user for the platform must be added to the cluster.
NOTE: In a cluster without Kerberos or SSO user management, the If LDAP is enabled, the If Kerberos is enabled, the |
For POSIX-compliant Hadoop environments, the user IDs of the |
UserID:
If possible, please create the user ID as: trifacta
This user should belong to the group: trifacta
User requirements:
Verify that the following HDFS paths have been created and that their permissions allow access by the platform user account:
NOTE: Depending on your Hadoop distribution, you may need to modify the following commands to use the Hadoop client installed on the |
Below, change the values for trifacta to match the platform user for your environment:

    hdfs dfs -mkdir /trifacta
    hdfs dfs -chown trifacta /trifacta
    hdfs dfs -mkdir -p /user/trifacta
    hdfs dfs -chown trifacta /user/trifacta
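
If the platform user is not named trifacta, it can help to generate the commands and review them before running anything against the cluster. The following is a minimal dry-run sketch: `print_user_setup` is a hypothetical helper, the user name `wrangler` is illustrative only, and the sketch assumes that the /trifacta directory name stays fixed while its owner and the home directory under /user change.

```shell
#!/bin/sh
# Dry run only: prints the HDFS commands instead of executing them.
# print_user_setup is a hypothetical helper, not part of Hadoop or the platform.
print_user_setup() {
  user="$1"
  # Assumption: the /trifacta directory keeps its name; only the owner changes.
  echo "hdfs dfs -mkdir /trifacta"
  echo "hdfs dfs -chown $user /trifacta"
  # The home directory follows the user name.
  echo "hdfs dfs -mkdir -p /user/$user"
  echo "hdfs dfs -chown $user /user/$user"
}

# Review the commands for an illustrative user named "wrangler".
print_user_setup wrangler
```

Once the printed commands look right, they can be run by an account with sufficient HDFS privileges.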
The following directories must be available to the platform user on HDFS. Below, you can review the minimum permissions for basic authentication and for impersonated authentication on each default directory. Secure impersonation is described later.
Directory | Minimum required permissions | Secure impersonation permissions |
---|---|---|
/trifacta/uploads | 700 | 770. Set this to 730 to prevent users from browsing this directory. |
/trifacta/queryResults | 700 | 770 |
/trifacta/dictionaries | 700 | 770 |
/trifacta/tempfiles | 770 | 770 |
You can use the following commands to configure permissions on these directories. The following permissions scheme reflects the secure impersonation permissions in the table above, with /trifacta/uploads set to 730:

    $ hdfs dfs -mkdir -p /trifacta/uploads
    $ hdfs dfs -mkdir -p /trifacta/queryResults
    $ hdfs dfs -mkdir -p /trifacta/dictionaries
    $ hdfs dfs -mkdir -p /trifacta/tempfiles
    $ hdfs dfs -chown -R trifacta:trifacta /trifacta
    $ hdfs dfs -chmod -R 770 /trifacta
    $ hdfs dfs -chmod -R 730 /trifacta/uploads
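
To see why 730 prevents browsing, it may help to expand the octal modes used above into owner/group/other notation. The sketch below is illustrative only; `mode_to_symbolic` is a hypothetical helper and does not touch HDFS.

```shell
#!/bin/sh
# Expand a three-digit octal mode into rwx notation (owner, group, other).
# mode_to_symbolic is a hypothetical helper for illustration.
mode_to_symbolic() {
  mode="$1"
  out=""
  # Split the mode into single digits and translate each one.
  for digit in $(printf '%s' "$mode" | sed 's/./& /g'); do
    case "$digit" in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
      *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
  done
  echo "$out"
}

# Modes used in the table and commands above.
for m in 700 770 730; do
  echo "$m $(mode_to_symbolic "$m")"
done
```

With 730, the group keeps write and execute on the directory but loses read. Group members can still create files and reach a file whose name they know, but they cannot list the directory contents.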
If these standard locations cannot be used, you can configure the HDFS paths.

    "hdfs.pathsConfig.fileUpload": "/trifacta/uploads",
    "hdfs.pathsConfig.batchResults": "/trifacta/queryResults",
    "hdfs.pathsConfig.dictionaries": "/trifacta/dictionaries",
The platform supports Kerberos authentication on Hadoop.
NOTE: If Kerberos is enabled for the Hadoop cluster, the keytab file must be made accessible to the platform.
The Hadoop cluster configuration files must be made available to the platform. You can either copy the files over from the cluster or create a local symlink to them.
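
As a rough sketch of the symlink approach, the helper below links a set of Hadoop client configuration files into a local directory. Everything specific here is an assumption: `link_hadoop_conf` is a hypothetical helper, the file list is the usual set of Hadoop site files rather than a list confirmed for this platform, and both directories must be supplied for your environment. Symlinks only apply when the files already exist on the same machine; otherwise, copy them from the cluster.

```shell
#!/bin/sh
# Symlink Hadoop client configuration files into a local directory.
# link_hadoop_conf and both directory arguments are illustrative assumptions.
link_hadoop_conf() {
  src_dir="$1"   # where the Hadoop client config lives, e.g. /etc/hadoop/conf
  dest_dir="$2"  # directory where the platform should find the files (assumed)
  mkdir -p "$dest_dir" || return 1
  # Assumed file list: the common Hadoop *-site.xml files.
  for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    if [ -f "$src_dir/$f" ]; then
      # -s creates a symlink; -f replaces a stale link of the same name.
      ln -sf "$src_dir/$f" "$dest_dir/$f"
    else
      echo "missing: $src_dir/$f" >&2
    fi
  done
}

# Example call (paths are placeholders, not platform defaults):
#   link_hadoop_conf /etc/hadoop/conf /path/to/platform/hadoop-conf
```

Missing files are reported on standard error rather than treated as fatal, since not every cluster ships all four files.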
For more information, see Configure for Hadoop.