
Release 5.0.1




This section describes the basics of creating a Hive connection and modifying it for specific aspects of your Hadoop environment.


NOTE: The platform supports a single, global connection to Hive. All users must use this connection.

For more information on how the platform works with Hive, see Using Hive.

Create Encryption Key File

Before you create your Hive connection, you must create and reference an encryption key file. If you have not created one already, see Create Encryption Key File.

Create Hive Connection

You can create the connection through the application, the command line interface, or the APIs.

Create through application

An Alteryx administrator can create the Hive connection through the application.


  1. Log in to the application.
  2. In the menu, click Datasets.
  3. Click Import Data.
  4. In the left nav bar, click the New icon. 
  5. In the Create Connection page, click the Hive connection card.
  6. Specify the properties for your Hive database. 

    1. Before you create the connection, you should review the rest of this section to verify that you are using the proper connection string options for the Hive connection to work in your environment. 

    2. For more information, see Create Connection Window.

  7. Click Save.

Create through CLI

This connection can also be created through the Alteryx command line interface. Notes on creating the connection:

  • If you are upgrading to Release 3.1 or later and had enabled Hive connectivity in your earlier version, your connection is still valid. If you need to make changes to it, you must execute them through the command line interface. See CLI for Connections.
  • If you are creating a Hive connection in an environment where HTTP is deployed or security features such as SSL, user impersonation, or Kerberos are deployed, please be sure to read the remainder of this section before creating the connection.
  • This connection must be created by an Alteryx administrator and must be created as a global connection. You can create only one connection of this type.
  • By default, the platform connects to Hive over TCP. If you need to connect over HTTP, additional configuration is required as part of the create connection step. See Configure for HTTP below.

The CLI tools are stored in the following directory:


Example command (all one command):

./ create_connection --user_name <trifacta_admin_username> --password <trifacta_admin_password> 
--conn_type hadoop_hive --conn_name aHiveConnection 
--conn_description "This is my Hive connection." 
--conn_host --conn_port 10000 
--conn_credential_type trifacta_service --conn_params_location ~/.trifacta/p.json 
--conn_is_global --cli_output_path ./conn_create.out
create_connection - CLI action type. Please leave this value as create_connection.

--user_name - Username of the Alteryx administrator account to use. Please specify the full username. For this connection type, it must be an admin user account.

--password - Password of the account.

--conn_type - The type of connection. Set this value to hadoop_hive.

--conn_name - The internal name of this connection.

NOTE: This value must be unique among all of your connections.

--conn_description - A user-friendly description for this connection, which appears in the application.

--conn_host - Host of the Hive instance.

--conn_port - Port number of the Hive instance.

TCP connection: 10000
HTTP connection: 10001

NOTE: Additional configuration is required to enable an HTTP or HTTPS connection. See Configure for HTTP below.

--conn_credential_type - The type of credentials to use. Set this value to trifacta_service for Hive. For this credential type, a credential file is not used.

--conn_params_location - Path to the file containing the parameters to pass to Hive during interactions. See below.

--conn_is_global - This flag is required. It makes the connection available to all Alteryx users. This flag requires no value.

NOTE: After a connection has been made global, it cannot be made private. You must delete the connection through the CLI and then recreate it.

You cannot delete a connection if it has imported datasets associated with it. You can delete a connection without datasets, but the publication information associated with the connection is dropped.

By default, any connection is tested as part of a create or edit action. Include this flag to skip testing the connection. This flag requires no value.

Tip: After creation, you can test the connection and modify it through the application. See Flows Page.

--cli_output_path - The path to the file where results of this command are written.

NOTE: The user issuing the command must also have execute permissions on all parent folders in the specified cli_output_path.

Example Params file:

A parameter file containing the following information must be accessible within the Alteryx deployment directory. Please add a reference to this file in your command.

Some deployment scenarios require that specific values be passed into the connectStrOpts parameter for the connection to work. Please be sure to read the optional configuration sections below.


  {
    "jdbc": "hive2"
  }
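The SASL and Kerberos examples later in this section suggest the full shape of this file. A fuller sketch for reference (the empty connectStrOpts value and the default database shown here are illustrative placeholders, not required values):

```json
{
  "connectStrOpts": "",
  "defaultDatabase": "default",
  "jdbc": "hive2"
}
```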

For more information on the Alteryx command line interface, see CLI for Connections.

Create via API

For more information, see API Connections Create v3.

Additional Configuration Options

Configure for HTTP

By default, the Designer Cloud Powered by Trifacta platform utilizes TCP to connect to Hive. As needed, the platform can be configured to interact with Hive Server 2 over HTTP. 

NOTE: HTTP connections are enabled by inserting additional parameters into the connection string. In some environments, the JDBC driver may insert these parameters automatically, based on configuration in hive-site.xml. If the driver inserts these parameters for you, the Designer Cloud Powered by Trifacta platform does not need additional configuration.

NOTE: If you are planning to use SSL, additional configuration is required. See below.



To enable the Designer Cloud Powered by Trifacta platform to use HTTP when connecting to Hive, please do the following.

  1. Create a params file for your Hive connection. This file must contain at least the following entry:

        "connectStrOpts": ";transportMode=http;httpPath=cliservice"
  2. Save the file. 
  3. In your CLI command, set the value of --conn_port = 10001.
  4. Execute the command.

  5. Test the Hive connection.

  6. If the connection fails, delete it, correct the configuration, and recreate the connection.
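The transport options above are ultimately appended to the Hive JDBC URL. A minimal sketch of how the pieces fit together, using a hypothetical host and database:

```python
# Sketch of how connectStrOpts combine with host, port, and database into a
# HiveServer2 JDBC URL. The host and database below are hypothetical examples.
def build_hive_jdbc_url(host, port, database, connect_str_opts=""):
    # connectStrOpts are appended verbatim after the database name
    return f"jdbc:hive2://{host}:{port}/{database}{connect_str_opts}"

url = build_hive_jdbc_url(
    "hive.example.com", 10001, "default",
    ";transportMode=http;httpPath=cliservice",
)
print(url)
# jdbc:hive2://hive.example.com:10001/default;transportMode=http;httpPath=cliservice
```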

Configure for SASL

NOTE: Cloudera supports an additional method for communicating over SSL with Hive. For more information on how to identify the method used by your Cloudera cluster, see Configure for Cloudera.

The steps below describe how to enable the SASL-QOP method of SASL (Simple Authentication and Security Layer) communications. To enable, you must add an additional connection string parameter.


  1. Create or edit the params file for your connection to Hive. This file must include the following setting for connectStrOpts:

      "connectStrOpts": ";sasl.qop=auth-conf",
      "defaultDatabase": "default",
      "jdbc": "hive2"
  2. Save the file.
  3. In your CLI command, set the value of --conn_port = 10001.
  4. Execute the command.

  5. Test the Hive connection.

  6. If the connection fails, delete it, correct the configuration, and recreate the connection.
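If you script the params file update, the SASL option can be appended to any existing connectStrOpts value rather than overwriting it. A sketch (the starting values mirror the example params file earlier in this section; writing the result back to the params file path is left out):

```python
import json

# Sketch: append the SASL-QOP option to an existing Hive params dict.
# Starting values mirror the example params file in this section.
params = {"defaultDatabase": "default", "jdbc": "hive2"}
params["connectStrOpts"] = params.get("connectStrOpts", "") + ";sasl.qop=auth-conf"
print(json.dumps(params, indent=2))
```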

Configure for Kerberos

For a Kerberos-enabled cluster, you must include the Kerberos principal value as part of the Hive connection string options.

Tip: If you have already created the Hive connection, you can add the connectStrOpts property value below through the application. See Create Connection Window.

Please complete the following steps to add the Kerberos principal value for Hive to your Hive connection.


  1. Retrieve the Kerberos principal value for your deployment.

  2. Apply the Kerberos principal to your Hive connection:
    1. Create connection through CLI: Create or edit the params file for your connection to Hive. This file must include the following setting for connectStrOpts :

        "connectStrOpts": ";principal=hive/<principal_value>",
        "defaultDatabase": "default",
        "jdbc": "hive2"

      <principal_value> is the Kerberos principal (<host>@<realm>) for Hive on your cluster. 
      For more information, see CLI for Connections.

    2. Create connection through the application: Add the Hive principal in the above format to the Connect String Options in your Hive connection. See Create Connection Window.
  3. Save the file.
  4. Test the connection.
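The principal option follows a fixed shape, so it can be assembled from the host and realm. A sketch with hypothetical values:

```python
# Sketch: build the Kerberos connect string option for Hive.
# The host and realm below are hypothetical; use your cluster's values.
def kerberos_connect_opt(host, realm):
    # <principal_value> is <host>@<realm> for Hive on your cluster
    return f";principal=hive/{host}@{realm}"

opt = kerberos_connect_opt("hive-host.example.com", "EXAMPLE.COM")
print(opt)
# ;principal=hive/hive-host.example.com@EXAMPLE.COM
```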

Run Hive jobs on a specific YARN queue

If needed, you can route any Hadoop jobs sourced from Hive to a specific YARN job queue. 


  1. In your Hive connection, add the following option to your Connection String Options (connectStrOpts) in your Hive connection:

      "connectStrOpts": ";sess_var_list?mapred.job.queue.name=<your_queue>",

    <your_queue> is the name of the YARN job queue.

  2. Save the change.
  3. Test the connection. 
  4. Run a job, and verify that it is routed to the appropriate YARN job queue.
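A sketch of assembling the queue option. The mapred.job.queue.name session variable used here is an assumption for illustration; confirm the property name your cluster's Hive and YARN configuration expects:

```python
# Sketch: build the session-variable option that routes Hive-sourced jobs to a
# YARN queue. The property name is an assumption; verify it for your cluster.
def queue_connect_opt(queue, prop="mapred.job.queue.name"):
    return f";sess_var_list?{prop}={queue}"

print(queue_connect_opt("analytics"))
# ;sess_var_list?mapred.job.queue.name=analytics
```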

