...

  • The D s platform is integrated with an EMR cluster:
    • EMR version 5.8.0 or later
    • EMR cluster has been configured with HiveServer2
  • The Hive deployment must be integrated with AWS Glue.

    NOTE: Hive connections are supported when S3 is the backend datastore.


  • For HiveServer2 connectivity, the D s node has direct access to the Master node of the EMR cluster.
  • The Hive metastore must be configured to use AWS Glue.
  • For Hive on AWS EMR to access AWS Glue, the IAM roles assigned to the EMR cluster must include permissions for the required AWS Glue actions. For more information, see https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-iam-roles-defaultroles.html#emr-iam-contents-ec2role.
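As a rough illustration of the IAM requirement above, the following Python sketch assembles a policy document that grants read-style AWS Glue actions. The action list and the wildcard resource are assumptions chosen for illustration, and the helper name build_glue_read_policy is hypothetical. The authoritative list of permissions is in the AWS documentation linked above.

```python
import json

# Assumed, illustrative set of read-style Glue actions. Confirm the exact
# permissions your cluster needs against the AWS documentation.
GLUE_READ_ACTIONS = [
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
    "glue:GetPartition",
    "glue:GetPartitions",
]


def build_glue_read_policy(resources=("*",)):
    """Return an IAM policy document (as a dict) allowing Glue read actions.

    `resources` defaults to a wildcard for brevity only; narrowing it to
    specific catalog, database, and table ARNs is usually preferable.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(GLUE_READ_ACTIONS),
                "Resource": list(resources),
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON so it can be reviewed before attaching it to a role.
    print(json.dumps(build_glue_read_policy(), indent=2))
```

The printed JSON could be reviewed and then attached to the role used by the cluster's EC2 instances. How the policy is attached depends on your environment.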

Required Glue table properties

Each Glue table must be created with the following properties specified:

  • InputFormat
  • OutputFormat
  • Serde 

These properties must be specified for the Hive JDBC driver to read the Glue tables.
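One way to catch a missing property before connecting is to inspect the table definition. The sketch below assumes the definition is a dictionary shaped like the Table object returned by the AWS Glue GetTable API, where the three properties sit under StorageDescriptor. The function name and the sample table are hypothetical.

```python
# Where each required property lives in a Glue table definition.
# Layout assumed to follow the Glue GetTable response ("Table" object).
REQUIRED_PROPERTIES = {
    "InputFormat": ("StorageDescriptor", "InputFormat"),
    "OutputFormat": ("StorageDescriptor", "OutputFormat"),
    "Serde": ("StorageDescriptor", "SerdeInfo", "SerializationLibrary"),
}


def missing_glue_properties(table):
    """Return the names of required properties that are absent or empty."""
    missing = []
    for label, path in REQUIRED_PROPERTIES.items():
        node = table
        for key in path:
            # Walk down the nested dicts; stop as soon as a level is absent.
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if not node:
            missing.append(label)
    return missing


# Hypothetical table definition for a delimited text table.
example_table = {
    "Name": "orders",
    "StorageDescriptor": {
        "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
        "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
        "SerdeInfo": {
            "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
        },
    },
}

if __name__ == "__main__":
    print(missing_glue_properties(example_table))
```

An empty result means all three properties are present. Any names returned identify what to add to the table definition.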

For additional limitations on accessing Hive tables through Glue, see https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive-metastore-glue.html#emr-hive-glue-considerations-hive.

Limitations

  • Access is read-only. Publishing to Hive hosted on EMR is not supported.
  • You cannot select datasets through the Database browser in the D s webapp.

    NOTE: Use of this integration requires the development of custom SQL queries against the AWS Glue metadata store.
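Because datasets are selected through custom SQL, it can help to generate the queries consistently. The sketch below builds a read-only SELECT statement with backtick-quoted identifiers, as Hive accepts. The helper name build_select and the database, table, and column names are hypothetical, and restricting identifiers to letters, digits, and underscores is an assumption made to keep the quoting simple.

```python
import re

# Assumption: identifiers contain only letters, digits, and underscores.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_select(database, table, columns=None, limit=None):
    """Build a read-only SELECT statement for a table in the Glue catalog."""
    for name in [database, table, *(columns or [])]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    # Select every column when no explicit column list is given.
    cols = ", ".join(f"`{c}`" for c in columns) if columns else "*"
    sql = f"SELECT {cols} FROM `{database}`.`{table}`"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql


if __name__ == "__main__":
    print(build_select("sales_db", "orders", ["id", "total"], limit=100))
```

The resulting string can be pasted in as the custom SQL query. Only SELECT statements apply here, since access through this integration is read-only.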


...