The platform can be configured to access data stored in relational database sources over the JDBC protocol. When this connection method is used, individual database tables and views can be imported as datasets.

Supported Relational Databases

The platform can connect to the following relational database platforms and versions:

  • Oracle 12.1.0.2
  • SQL Server 12.0.4
  • PostgreSQL 9.3.10
  • Teradata 14.10+

    NOTE: To enable Teradata connections, you must download and install Teradata drivers first. For more information, see Enable Teradata Connections.


Ports

For any relational source to which you are connecting, the platform must be able to reach it through the specified host and port values.

Please contact your database administrator for the host and port information.
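
Once you have the host and port, it can help to confirm that they are reachable from the node that hosts the platform before you create a connection. The following is a minimal sketch, not part of the product: the function name can_reach and the DEFAULT_PORTS table are illustrative assumptions. The ports listed are the vendors' usual defaults and may differ in your environment.

```python
import socket

# Usual vendor default ports (assumption: your databases may use other values).
DEFAULT_PORTS = {
    "oracle": 1521,
    "sqlserver": 1433,
    "postgresql": 5432,
    "teradata": 1025,
}


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_reach("db.example.com", DEFAULT_PORTS["postgresql"]) tests a hypothetical PostgreSQL host. A result of True only shows that the port accepts TCP connections. It does not validate credentials or driver configuration.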

Enable

This feature is enabled automatically. 

To disable:


To prevent users from connecting to relational data sources to import datasets and write results, complete the following configuration changes:

NOTE: Disabling this feature hides existing relational connections.


  1. Locate the following setting:

    Connectivity feature


  2. Set this value to Disabled.

Disable relational publishing

By default, relational connections are read/write, which means that users can create connections that enable writing back to source databases.

  • When this feature is enabled, writeback is enabled for all natively supported relational connection types. See Connection Types.
  • Depending on the connection type, the platform writes its data to different field types in the target database. For more information, see Type Conversions.
  • Some limitations apply to relational writeback. See Limitations below.

As needed, you can disable this feature.

Steps:

  1. Locate the following parameter, which is set to true by default, and set it to false:

    "webapp.connectivity.relationalWriteback.enabled": true,


  2. Save changes and restart the platform.

Publishing through relational connections is disabled.
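
If you manage configuration through scripts, the change above can be sketched as follows. This is an illustration only. It assumes the settings are available as JSON text with the parameter stored as a flat, dotted key, exactly as it appears in the step above. The function name disable_writeback is hypothetical, and the location of the configuration file is not specified here.

```python
import json

# Parameter name taken from the step above.
WRITEBACK_KEY = "webapp.connectivity.relationalWriteback.enabled"


def disable_writeback(config_text: str) -> str:
    """Return the JSON configuration text with relational writeback set to false.

    Assumes the parameter is stored as a flat, dotted key (an assumption).
    """
    config = json.loads(config_text)
    config[WRITEBACK_KEY] = False
    return json.dumps(config, indent=2)
```

After writing the result back to your configuration, a platform restart is still required for the change to take effect.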

Limitations

Limitations on relational publishing:

When the relational publishing feature is enabled, it is automatically enabled for all platform-native connection types. You cannot selectively disable relational publishing for Oracle, SQL Server, PostgreSQL, or Teradata connection types. Before you enable this feature, verify that all user accounts accessing databases of these types have appropriate permissions.


NOTE: Writing back to the database uses the same user credentials, and therefore the same permissions, as reading from it. Verify that the users who create read/write relational connections have appropriate access.


Execution at scale

Jobs for large-scale relational sources can be executed on the Spark running environment. After the data source has been imported and wrangled, no additional configuration is required to execute at scale.

NOTE: End-to-end performance is likely to be impacted by:

  • streaming data volumes over 1 TB from the source,
  • streaming from multiple concurrent sources,
  • overall network bandwidth.

When the job is completed, any temporary files are automatically removed from HDFS. 

For more information, see Run Job Page.

Password Encryption Key File

Relational database passwords are encrypted using key files.

Configure Security

For more information, see Configure Security for Relational Connections.

Configure Connections

Configure at least one individual connection for any of the supported relational systems.  You can configure more than one connection to the same relational system using different credentials. See Connection Types.
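
Because these connections run over JDBC, each one ultimately resolves to a driver-specific connection URL. The sketch below shows the commonly documented URL shape for each supported vendor's JDBC driver. It is an assumption that these shapes match what the platform builds internally: the platform may ask for separate host, port, and database fields instead of a full URL, and the function name jdbc_url is hypothetical.

```python
# Commonly documented JDBC URL shapes for each vendor's driver.
JDBC_TEMPLATES = {
    "oracle": "jdbc:oracle:thin:@//{host}:{port}/{database}",
    "sqlserver": "jdbc:sqlserver://{host}:{port};databaseName={database}",
    "postgresql": "jdbc:postgresql://{host}:{port}/{database}",
    "teradata": "jdbc:teradata://{host}/DATABASE={database},DBS_PORT={port}",
}


def jdbc_url(vendor: str, host: str, port: int, database: str) -> str:
    """Build a JDBC URL for one of the supported relational vendors."""
    try:
        template = JDBC_TEMPLATES[vendor.lower()]
    except KeyError:
        raise ValueError(f"unsupported vendor: {vendor}") from None
    return template.format(host=host, port=port, database=database)
```

Two connections to the same system with different credentials would share the same URL and differ only in the user name and password supplied with it.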

Enable SSO Connections

If you have enabled Kerberos on the Hadoop cluster, you can leverage the Kerberos global keytab to enable SSO connections to relational sources. See Enable SSO for Relational Connections.