The command line interface (CLI) lets you manage connections between the Designer Cloud Powered by Trifacta® platform and various types of datastores. You can use the CLI to do the following:
- Create, edit, or delete connections
NOTE: In this release, you cannot create Redshift or SQL DW connections via the CLI. This known issue will be fixed in a future release.
- Get information on all connections
NOTE: Sharing of connections is not supported through the command line interface.
Requirements
- The CLI must have access to a running Alteryx instance. You can specify the host and port of this instance.
For each connection that you create, the Alteryx node must be able to access it through the listed host and port.
Command Reference
The CLI tools are stored in the following directory:
/opt/trifacta/bin/
For creating or modifying connections, execute the following command:
./trifacta_cli.py (parameters)
Parameters are specified below.
Parameters
Common
These parameters are common to job or connection actions.
Parameter | Description | Applicable CLI Commands |
---|---|---|
command_type | (Required) The type of CLI command to execute. See Examples below. For more information on the job-related commands, see CLI for Jobs. | All |
user_name | (Required) Alteryx username of the user to execute the job. Please specify the full username. NOTE: In the response, this value is listed as | All |
password | (Required) Alteryx password for the username NOTE: If you have enabled Kerberos-based access to the Designer Cloud Powered by Trifacta platform, you do not need to provide a password. To enable, additional configuration is required. See Set up for a Kerberos-enabled Hadoop cluster. NOTE: Passwords can be stored in an external file. See CLI Config File. | All |
cli_output_path | (Optional) Defines the client-side path where the JSON output is stored for all commands. Default value is NOTE: The user issuing the command must also have execute permissions on all parent folders in the specified | All |
disable_ssl_certification | (Optional) When communicating over HTTPS, this setting can be used to override the default behavior of validating the server certificate before executing the command. NOTE: If you have stored a self-signed certificate on the Alteryx node, please set the | All |
conn_ssl | (Optional) Connect to the datastore over SSL. NOTE: You must modify the NOTE: SSL connections are not supported for Hive, Redshift, or SQL Server. | All |
Params for managing connections
The following parameters apply to managing connection objects only. Some of the preceding parameters may be required for connection actions.
Parameter | Description | Applicable CLI Command |
---|---|---|
conn_type | The type of connection. NOTE: After the connection has been created, you cannot change its type. Tip: For a list of supported connection types, enter the following at the command line: ./trifacta_cli.py create_connection -h These connection types can be created by Alteryx admin users only: NOTE: These connections must be created through the CLI and must be created as public connections. Include the --conn_is_global flag. NOTE: Only one Hive connection and one Redshift connection are permitted per Alteryx deployment. These connection types can be created by any user with appropriate permissions: NOTE: Jobs using sources from these connections cannot be executed on Spark. For more information on the supported connection types and the tokens to insert for this parameter, see Connection Types. | create_connection |
conn_name | Internal name of the connection. This name is referenced in your CLI scripts. It should be a single value without spaces. NOTE: This value must be unique among your connection names. | create_connection , edit_connection , list_connection , delete_connection |
conn_id | The internal identifier for the connection. When a connection is created, it is assigned an internal numeric identifier. This ID or the connection_name can be used to reference the connection in future commands. Tip: This value is available when you hover over a connection in the application. See Flows Page. | edit_connection , list_connection , delete_connection |
conn_host | Host of the datastore to which you are connecting. | create_connection , edit_connection |
conn_port | Port number to access the datastore. | create_connection , edit_connection |
conn_description | This text value is displayed to users when they create or edit connections of this type through the Designer Cloud application. | create_connection , edit_connection |
conn_credential_type | The type of credentials to create. Supported values: | create_connection , edit_connection |
conn_credential_location | The path to a JSON file containing the credentials for your connection, as consistent with the NOTE: A credential file is not needed if the credential type is | create_connection, edit_connection |
conn_params_location | When you create a connection, you can reference a JSON file containing parameters to apply during the creation of any connection of this type. See Params file below. | create_connection , edit_connection |
conn_skip_test | If this parameter is added to the command, the connection is not tested. The default is to test the connection. This flag requires no value. Tip: After creation, you can test and modify the connection through the application. See Flows Page. | create_connection , edit_connection |
conn_is_global | If this parameter is added, the connection is public and is available to all Alteryx users after it has been created. This flag requires no value. NOTE: To use this option, the executing user must be an Alteryx admin. Hive and Redshift connections require this parameter. NOTE: After a connection has been made public, it cannot be made private again. It must be deleted and recreated. | create_connection , edit_connection |
For documentation on the CLI parameters, run:
./trifacta_cli.py --help
Additional documentation might be available for individual commands using the following:
./trifacta_cli.py <command> --help
Credentials file
You can store connection login credentials in a file on the Alteryx node. When managing connections, you can reference this JSON credentials file in the command, which forces the use of encrypted versions of the credentials stored in the file. Examples are provided below.
Example - Basic credentials:
This example applies to relational connection types: Oracle, PostgreSQL, SQL Server, and Teradata.
{ "username": "<your_username>", "password": "<your_password>" }
Example - AWS credentials:
This example applies to connections of AWS type (Redshift).
{ "username": "<your_user>", "password": "<your_password>", "iamRoleArn": "<your_IAM_role_ARN>" }
NOTE: iamRoleArn is optional. For more information, see Configure for EC2 Role-Based Authentication.
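Because the CLI reads the credentials file from disk, it is worth locking the file down and verifying it parses before referencing it in a command. The following is a minimal sketch: the path `~/.trifacta/config_conn.json` and the placeholder values are illustrative, and the JSON shape matches the basic-credentials example above.

```shell
# Write a basic credentials file (placeholder values) and restrict it to the owner.
mkdir -p ~/.trifacta
cat > ~/.trifacta/config_conn.json <<'EOF'
{ "username": "<your_username>", "password": "<your_password>" }
EOF
chmod 600 ~/.trifacta/config_conn.json

# Sanity-check that the file is valid JSON before passing it to the CLI.
python3 -m json.tool ~/.trifacta/config_conn.json
```

The same pattern applies to the AWS-credentials variant; only the keys in the JSON body change.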
Params file
In an external file, you can create a set of parameters to pass to any object for which you are creating a connection. For example, when you create a connection to a database, you may need to reference a default database to which any instance of the connection connects.
The following parameters are supported for each vendor.
Vendor | JSON Parameter | Description | Required |
---|---|---|---|
Hive | defaultDatabase | Name of the default database | No |
Redshift | defaultDatabase | Name of the default database | Yes |
PostgreSQL | database | Name of the database | Yes |
Oracle | service | Service to use for the connection | Yes |
SQL Server | None | | |
Teradata | None | | |
Additional parameters:
Except for Redshift connections, you can submit additional configuration parameters using the connectStrOpts key-value pair in the parameters file. Example:
"connectStrOpts": ";transportMode=http;httpPath=cliservice"
NOTE: Each vendor uses a specific separator between the connection URL and the connection string options. For example, if you are creating or editing a Teradata connection and are submitting connectStrOpts parameters, the string value must begin with a comma:
"connectStrOpts": ",Key1=Value1,Key2=Value2"
For more information, see the documentation provided with your database product.
Redshift uses the extraLoadParams
method, which is described below.
Arbitrary parameters for JDBC connections:
For any supported JDBC connection type, you can include arbitrary parameters specific to the JDBC database as part of your connection string options.
With the exception of Oracle, all JDBC vendors support the following example. Suppose the database to which you are connecting supports a parameter (myView), which for this database has a value of custom1. To extend the preceding Teradata example, your connectStrOpts value would be the following, which begins with a comma (,):
"connectStrOpts": ",Key1=Value1,Key2=Value2?myView=custom1"
When the connection is created and used, the connection string might look like the following:
jdbc:teradata://example.com:1025/DatabaseServerName?myView=custom1
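A params file carrying such options can be generated and checked before it is passed via --conn_params_location. This is a sketch only: the file name ./teradata_params.json and the option keys are placeholders taken from the example above.

```shell
# Write a Teradata params file containing connection string options.
# Teradata options must begin with a comma (see the note above);
# Key1/Key2 and myView=custom1 are placeholder values from the example.
cat > ./teradata_params.json <<'EOF'
{ "connectStrOpts": ",Key1=Value1,Key2=Value2?myView=custom1" }
EOF

# Confirm the file is valid JSON and that the options string starts with a comma.
python3 - <<'EOF'
import json
with open("./teradata_params.json") as f:
    params = json.load(f)
assert params["connectStrOpts"].startswith(",")
print("params file OK")
EOF
```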
For submitting arbitrary parameters to Oracle, please see the example below.
Example - Hive params:
NOTE: By default, the Hive connection is defined to use TCP. If you are using HTTP to connect to Hive, additional configuration is required, including insertion of additional parameters in your params file. See Configure for Hive.
NOTE: If you are connecting to a Kerberos-enabled cluster, you must include the Kerberos principal for Hive as part of the connectStrOpts value. See Configure for Hive.
{ "connectStrOpts": ";<depends_on_deployment>", "defaultDatabase": "default", "jdbc": "hive2" }
For more information on connection string options for Hive, see Configure for Hive.
Example - Redshift params:
{ "defaultDatabase":"<your_database>", "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS" }
The first parameter defines the default database.
The second parameter is used when you publish results to Redshift. For more information on these values, see http://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-conversion.html.
Example - PostgreSQL params:
{ "database":"<your_database>" }
Example - Oracle params:
{ "service":"orcl" }
For submitting arbitrary parameters to Oracle, the arbitrary string must follow the ORA format, in which most of the connection string is replaced by parameters. For example:
{ "connectStrOpts": "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=orcl)))", "service":"orcl" }
In this case, the generated connection string might look like the following:
jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=orcl)))
The original host, port, and service name values specified in the connection are ignored and replaced by these values.
Examples
At the command line, all jobs must be executed through connection objects. For each datastore to which the Designer Cloud Powered by Trifacta platform is connected, you must create at least one connection object and then reference it in any job execution tasks.
Create connection
NOTE: For Hive and Redshift, connections must be created through the CLI by an Alteryx admin user and must be created as public connections (include the --conn_is_global flag). You can only create one connection of each of these types.
For more information on creating a Redshift connection through the CLI, see Create Redshift Connections.
For more information on creating a Hive connection through the CLI, see Configure for Hive.
Command
Example (all one command):
./trifacta_cli.py create_connection --user_name <trifacta_user> --password <trifacta_password> --conn_type microsoft_sqlserver --conn_name aSQLServerConnection --conn_description "This is my connection." --conn_host example.com --conn_port 1234 --conn_credential_type basic --conn_credential_location ~/.trifacta/config_conn.json --conn_params_location ~/.trifacta/p.json --cli_output_path ./conn_create.out
Output
Success: Connection aSQLServerConnection created.
JSON results written to conn_create.out.
JSON Response
Output is written to ./conn_create.out
.
{
  "conn_credential_location": "~/.trifacta/config_conn.json",
  "conn_credential_type": "basic",
  "conn_host": "example.com",
  "conn_id": 9,
  "conn_name": "aSQLServerConnection",
  "conn_params_location": "~/.trifacta/p.json",
  "conn_port": "1234",
  "conn_type": "microsoft_sqlserver",
  "host": "http://example.com:3005",
  "results": {
    "createdAt": "2016-06-30T21:53:58.977Z",
    "createdBy": 3,
    "credential_type": "basic",
    "credentials": [
      { "username": "<trifacta_user>" }
    ],
    "deleted_at": null,
    "description": null,
    "host": "example.com",
    "id": 9,
    "is_global": false,
    "name": "aSQLServerConnection",
    "port": 1234,
    "type": "microsoft_sqlserver",
    "updatedAt": "2016-06-30T21:53:58.977Z",
    "updatedBy": 3
  },
  "status": "success",
  "user_name": "<trifacta_user>"
}
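In scripts, the new connection's ID can be pulled out of the JSON response file for use in later edit_connection or delete_connection calls. A minimal sketch, assuming the output was written to ./conn_create.out; the trimmed response file written here stands in for the real CLI output.

```shell
# For illustration, write a trimmed copy of the response shown above.
cat > ./conn_create.out <<'EOF'
{ "conn_id": 9, "conn_name": "aSQLServerConnection", "status": "success" }
EOF

# Extract the connection id from the CLI's JSON output file.
CONN_ID=$(python3 -c "import json; print(json.load(open('./conn_create.out'))['conn_id'])")
echo "created connection id: ${CONN_ID}"   # prints: created connection id: 9
```

The same extraction works against any of the cli_output_path files produced by the connection commands, since each response carries the conn_id field.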
Edit connection
Command
In the following command, all parameters specified within angle brackets are optional settings that can be changed; the others are required to perform any edit.
You must specify the conn_name or the conn_id.
NOTE: If you are changing any credential parameter, you must specify the conn_credential_type in the command. This step completely replaces the old credentials, so you must specify all connection parameters in the command.
Example (all one command):
./trifacta_cli.py edit_connection --user_name <trifacta_user> --password <trifacta_password> --conn_name aSQLServerConnection <--conn_type microsoft_sqlserver> <--conn_description "This is my connection."> <--conn_host mynewhost.com> <--conn_port 1234> <--conn_credential_type basic> <--conn_credential_location ~/.trifacta/config_conn.json> <--cli_output_path ./conn_edit.out>
Output
The following output assumes that only host and cli_output_path contain new values:
Success: Updated connection aSQLServerConnection.
JSON results written to conn_edit.out.
JSON Response
Output is written to ./conn_edit.out
.
{
  "conn_description": "This is my connection.",
  "conn_id": 9,
  "conn_name": "aSQLServerConnection",
  "conn_params_location": "~/.trifacta/p.json",
  "host": "http://mynewhost.com:3005",
  "results": {
    "createdAt": "2016-06-30T22:08:47.016Z",
    "createdBy": 3,
    "credential_type": "basic",
    "credentials": [
      { "username": "<trifacta_user>" }
    ],
    "deleted_at": null,
    "description": "This is my connection.",
    "host": "mynewhost.com",
    "id": 9,
    "is_global": false,
    "name": "aSQLServerConnection",
    "port": 1234,
    "type": "microsoft_sqlserver",
    "updatedAt": "2016-06-30T22:09:03.670Z",
    "updatedBy": 3
  },
  "status": "success",
  "user_name": "<trifacta_user>"
}
List connections
Command
Example (all one command):
./trifacta_cli.py list_connections --host dev.redshift.example.com --user_name <trifacta_user> --password <trifacta_password> --cli_output_path ./conn_list.out
Tip: You can specify a conn_name or conn_id to return information about a single connection.
Output
Listing connections
Found 2 connections for params {'noLimit': 'true'}.
Redshift:
  description: None
  host: dev.redshift.example.com
  credentials: ["{u'username': u'<trifacta_user>'}"]
  port: 5439
  is_global: True
  name: Redshift
  id: 2
  credential_type: custom
  params:
    extraLoadParams: BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS
    defaultDatabase: dev
  type: amazon_redshift
Hive:
  description: None
  host: dev.hive.example.com
  credentials: ["{u'username': u'<trifacta_user>'}"]
  port: 10000
  is_global: True
  name: Hive
  id: 1
  credential_type: conf
  params:
    jdbc: hive2
    connectStrOpts:
    defaultDatabase: default
  type: hadoop_hive
JSON results written to conn_list.out.
JSON Response
Output is written to ./conn_list.out
.
{
  "connections": [
    {
      "conn_createdAt": "2016-06-01T21:12:59.383Z",
      "conn_createdBy": 2,
      "conn_credential_type": "custom",
      "conn_credentials": [
        { "username": "<trifacta_user>" }
      ],
      "conn_deleted_at": null,
      "conn_description": null,
      "conn_host": "dev.redshift.example.com",
      "conn_id": 2,
      "conn_is_global": true,
      "conn_name": "Redshift",
      "conn_params": {
        "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS",
        "defaultDatabase": "dev"
      },
      "conn_port": 5439,
      "conn_type": "amazon_redshift",
      "conn_updatedAt": "2016-06-01T21:33:38.672Z",
      "conn_updatedBy": 2
    },
    {
      "conn_createdAt": "2016-06-01T21:11:41.222Z",
      "conn_createdBy": 2,
      "conn_credential_type": "conf",
      "conn_credentials": [
        { "username": "<trifacta_user>" }
      ],
      "conn_deleted_at": null,
      "conn_description": null,
      "conn_host": "dev.hive.example.com",
      "conn_id": 1,
      "conn_is_global": true,
      "conn_name": "Hive",
      "conn_params": {
        "jdbc": "hive2",
        "connectStrOpts": "",
        "defaultDatabase": "default"
      },
      "conn_port": 10000,
      "conn_type": "hadoop_hive",
      "conn_updatedAt": "2016-06-01T21:39:58.090Z",
      "conn_updatedBy": 2
    }
  ],
  "host": "http://localhost:3005",
  "status": "success",
  "user_name": "<trifacta_user>"
}
Delete connection
NOTE: You cannot delete a connection that is in use by a dataset.
Tip: You can delete a connection by using its internal connection identifier (conn_id) instead of its connection name.
Command
Example (all one command):
./trifacta_cli.py delete_connection --user_name <trifacta_user> --password <trifacta_password> --conn_name aSQLServerConnection --cli_output_path ./conn_delete.out
Output
Success. Deleted connection with id: 9
JSON results written to conn_delete.out.
JSON Response
Output is written to ./conn_delete.out
.
{
  "conn_name": "aSQLServerConnection",
  "host": "http://localhost:3005",
  "status": "success",
  "user_name": "<trifacta_user>"
}