Skip to main content

Configure Connectivity

This section covers the following areas around general connectivity of the Designer Cloud Powered by Trifacta platform.

Additional configuration may be required for individual connection types. For more information, see Connection Types.

Enable

The platform automatically enables connectivity to relational databases for reading in datasets and writing results back out.

Note

Relational connectivity requires the use of an encryption key file, which must be created and deployed before you create relational connections. For more information, see Create Encryption Key File in the Install Guide.

Data Service

The platform streams records from relational sources through the data service. These records are applied to transformation and sampling jobs on the Trifacta Photon running environment.

Tip

In general, you should not have to modify settings for the data service. However, if you are experiencing general performance issues or issues with specific connection types, you may experiment with settings in the data service.

For more information, see Configure Data Service in the Configuration Guide.

Relational Features

Custom SQL Query

To enhance performance of your relational datasets, you can enable the use of custom SQL queries against your relational datasources, which allows you to pre-filter your datasets before you ingest them into the platform. This feature is enabled by default, but additional configuration can be applied. See Enable Custom SQL Query.

JDBC Ingestion

By default, the platform ingests data from your relational datasources to the base storage layer for faster job execution. See Configure JDBC Ingestion.

Append hadoop principal to logged queries

Optionally, you can choose to enable appending the Hadoop principal as a comment to the SQL queries that are written to your database logs.

Note

This feature only applies if you are connecting to a Hadoop cluster. Otherwise, no value is inserted if this feature is enabled.

Example:

execute <unnamed>: SELECT * FROM "public"."artifacts" LIMIT 10 /* <hadoopPrincipal> */

In the above, the value of the Hadoop principal is written in the comment.

These types of queries are logged for the following basic activities:

  • Data preview: when previewing data from a relational source, a query is executed against the database

  • Data import: when selecting a table to import

  • Data import using custom SQL:

    • Click Validate button.

    • Custom SQL execution.

This feature enables auditing of Alteryx user activities through your database logs in Hadoop-based environments.

Steps:

  1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  2. Locate the following setting and set it to true:

    "feature.addUserIdToSQLQuery.enabled": false,
  3. After enabling the feature, locate the following setting, which defines the connection types for which the userId is written. Each connection type must be explicitly added to the list.

    1. These are the default connection types to which it is applied:

      "feature.addUserIdToSQLQuery.enabledConnectors": "teradata,sqlserver,oracle,db2,mysql",
    2. To apply this feature to other connection types, insert the connection type value to the comma-separated list. In the following example, PostgreSQL connections has been added to the list of default connection type:

      "feature.addUserIdToSQLQuery.enabledConnectors": "teradata,sqlserver,oracle,db2,mysql,postgres",
    3. You can acquire the value to insert by reviewing a connection of the same type in the Trifacta Application. When you select the connection in the Trifacta Application, the appropriate value to insert is listed for Connection type in the connection information.

    4. If the above setting is set to an empty string (""), the feature is applied to no connections and is effectively disabled.

  4. Save your changes and restart the platform.

Enable Driver Logging

Optionally, you can enable the inclusion of log entries from the driver underlying a relational connection.

Note

This option applies only to relational connections that rely on CData drivers. Some connections may not support this option.

When you create or edit a relational connection, insert the following as part of the Connect String Options:

logfile=STDOUT://;verbosity=5;

Log entries are included in the data-service.log file is included in the standard Support Bundle. For more information, see Support Bundle Contents.

Configure Security

For more information, see Configure Security for Relational Connections.

Enable SSO Connections

If you have enabled Kerberos on the Hadoop cluster, you can leverage the Kerberos global keytab to enable SSO connections to relational sources. See Enable SSO for Relational Connections.

Type Inference

By default, the platform applies type inferencing to all imported datasources. However, for schematized sources, you may wish to disable type inferencing from the platform instead relying on the types provided from the source.

Tip

You can also toggle the use of type inferencing for individual connections or for individual imported datasets.

For more information, see Configure Type Inference.

Enable OAuth 2.0 Connectivity

Some supported relational datastores support authentication using OAuth 2.0.

  • For each system to which you want to connect, you must create a client app in the target system. For more information, see Enable OAuth 2.0 Authentication.

  • For a target system for which you have created a client app, you must create at least one client in the Trifacta Application. For more information, see Create OAuth2 Client.