This section describes how to ensure that the platform is configured correctly to connect to Hive when Sentry is enabled for Hive. Sentry provides role-based authorization for Hive and other Hadoop components on the Cloudera platform.
Prerequisites
Secure Impersonation
NOTE: Although not required, secure impersonation is highly recommended for connecting the platform with Hive.
The platform requires the following additional configuration changes to maintain secure impersonation and work with Hive data:
Give the local Hive user access to the Unix or LDAP group associated with the platform.
Set the following umask in :
"hdfs.permissions.userUmask" = 027,
For Sentry, the following definitions and relationships apply.
Definition | Description |
---|---|
User | Individual account, as identified by the underlying authentication system |
Group | A set of users maintained by the authentication system |
Role | A set of privileges stored as a template to combine multiple access rules |
Privilege | An instruction or rule allowing access to an object. Examples include access to databases and tables, and the operations that can be executed on them. |
In Sentry:
NOTE: Before you begin, you should determine the privileges that must be granted to |
NOTE: If you are publishing back to Hive, please verify that one of the following is enabled:
Steps:
Create a role for users of the platform:
CREATE ROLE trifactaUserRole;
Grant that role to the group associated with the platform:
GRANT ROLE trifactaUserRole TO GROUP trifacta;
Grant all privileges to this role for the filesystem area under which platform output is generated. The full URI is required. Example:
NOTE: Modify the grants as needed for your environment.
GRANT ALL ON URI 'hdfs://domain_example:8020/trifacta/queryResults/user1@example.com/' TO ROLE trifactaUserRole;
NOTE: If the above URI changes, the above grant must be reapplied to the new URI.
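To confirm that the role and grants from the steps above are in place, statements like the following can be run from a Sentry-enabled Hive session (for example, through Beeline) as a user with Sentry administrative rights. This is a minimal sketch that reuses the example names `trifactaUserRole` and `trifacta` from the steps above; substitute the role and group names used in your environment.

```sql
-- List all roles known to Sentry; the example role should appear.
SHOW ROLES;

-- List the roles granted to the example group.
SHOW ROLE GRANT GROUP trifacta;

-- List the privileges held by the role, including the URI grant.
SHOW GRANT ROLE trifactaUserRole;
```

If the URI grant does not appear in the output of the last statement, reapply the grant with the full URI.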
When the platform is enabled with secure impersonation and submits requests to Hive, the following steps occur:
The Hive server authorizes access to the underlying table through Sentry as the Hadoop principal user assigned to the platform.
NOTE: This Hadoop principal is the user that should be configured with appropriate privileges and roles in Sentry.
hive, which should be part of the designated group
NOTE: Since Sentry assigns privileges and roles to Unix groups, a common practice is to assign the Hadoop principal users (used by |
NOTE: In UNIX environments, usernames and group names are case-sensitive. Please verify that you are using the case-sensitive names for users and groups in your Hadoop configuration and |
After you have completed your configuration changes, you should restart the platform. See Start and Stop the Platform.
To verify platform operations, run a simple job. For more information, see Verify Operations.
If you have deployed Sentry to manage access to a Kerberized environment using secure impersonation, you may encounter the following error when trying to write your results back to the Hadoop cluster:
NOTE: This issue is known to appear in Cloudera 5.7. It may not appear in later releases.
2015-09-02 20:49:54.111Z - WARN : com.trifacta.dataservice.Controller : Bad Request: org.springframework.jdbc.BadSqlGrammarException: StatementCallback; bad SQL grammar [CREATE TABLE `test_trifacta` ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' LOCATION 'hdfs://domain_example:8020/trifacta/queryResults/user1@example.com/test/143/original_98.avro' TBLPROPERTIES ('avro.schema.literal'='{"type":"record","name":"GenericTrifactaRecord","fields":[{"name":"name","type":["null","string"]},{"name":"id","type":["null","string"]},{"name":"id2","type":["null","long"]},{"name":"randomname","type":["null","string"]},{"name":"description","type":["null","string"]},{"name":"dob","type":["null","string"]},{"name":"title","type":["null","string"]},{"name":"corp","type":["null","string"]},{"name":"fixedone","type":["null","long"]},{"name":"fixedtwo","type":["null","long"]}]}')]; nested exception is org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException No valid privileges Required privileges for this query: Server=server1->URI=hdfs://domain_example:8020/trifacta/queryResults/user1@example.com/test/143/original_98.avro->action=*; |
In this case, Sentry fails to validate the URI permissions that would allow the user (user1@example.com) to access the HDFS path, because the permissions have not been specifically granted to the required role. Sentry queries for authorization, fails, and throws the above exception.
The solution is to grant the user's Sentry role all access privileges on the output URI for the target user. In the following example, access is granted to the role2 role:
GRANT ALL ON URI 'hdfs://domain_example:8020/trifacta/queryResults/user1@example.com/' TO ROLE role2;
Since permissions in Sentry are recursive through the directories, the target directory for the specific job is covered. For more information on Sentry permissions, see the Terminologies section in http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_sg_sentry.html.
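After applying the grant, one way to check it is to list the privileges held by the role and the roles in effect for the affected user's session. The following is a sketch, assuming the example role `role2` from above and a Sentry-enabled Hive session; the first statement is typically run as a Sentry administrator and the second as the affected user.

```sql
-- Privileges held by the role; the output should include the granted URI.
SHOW GRANT ROLE role2;

-- Run as the affected user: roles currently in effect for this session.
SHOW CURRENT ROLES;
```

If the role is missing from the second result, the user's group has likely not been granted the role.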