Data stored in relational database sources can be accessed over the JDBC protocol. When this connection method is used, individual database tables and views can be imported as datasets.
Supported Relational Databases
Connections are supported to the following relational database platforms and versions:
Any relational source to which you are connecting must be accessible through the specified host and port values.
Please contact your database administrator for the host and port information.
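Before defining a connection, it can help to confirm that the database host and port are reachable from the machine that will open the connection. The following is a minimal sketch, not part of the product: the function name `can_reach` and its timeout default are illustrative assumptions. It only tests TCP reachability and does not validate credentials or the JDBC driver.

```python
import socket


def can_reach(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical helper for illustration; it checks network reachability only.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_reach("db.example.com", 5432)` would report whether a placeholder host accepts connections on port 5432. A result of `False` usually points to a firewall rule, a wrong port, or a hostname that does not resolve.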
This feature is enabled automatically.
NOTE: For this release, relational connectivity is read-only. Writing to relational databases is not supported.
There are some differences in behavior between reading tables and views. See Using Databases.
Jobs for large-scale relational sources can be executed on the Hadoop running environment. After the data source has been imported and wrangled, no additional configuration is required to execute at scale.
NOTE: End-to-end performance is likely to be impacted by:
When the job is completed, any temporary files are automatically removed from HDFS.
For more information, see Run Job Page.
Before you create relational connections, you must create and reference an encryption key file. See Create Encryption Key File.
If you are experiencing very long query times, particularly for datasets backed by database views, you should consider raising the application timeout settings. These settings apply to the entire application and may have secondary effects. For more information, see Miscellaneous Configuration.
Configure at least one individual connection for any of the supported relational systems. You can configure more than one connection to the same relational system using different credentials. See Connection Types.
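As an illustration of two connections to the same relational system that differ only in credentials, the sketch below models a connection definition and derives its JDBC URL. Everything in it is an assumption made for the example: the class `RelationalConnection`, its field names, the connection names, and the host are hypothetical, and the URL templates are the commonly documented JDBC formats for those vendors rather than a list taken from this page.

```python
from dataclasses import dataclass

# Commonly documented JDBC URL formats (assumption: not taken from this page).
_URL_TEMPLATES = {
    "postgresql": "jdbc:postgresql://{host}:{port}/{database}",
    "mysql": "jdbc:mysql://{host}:{port}/{database}",
    "sqlserver": "jdbc:sqlserver://{host}:{port};databaseName={database}",
}


@dataclass(frozen=True)
class RelationalConnection:
    """Hypothetical connection definition, for illustration only."""
    name: str
    vendor: str
    host: str
    port: int
    database: str
    username: str

    def jdbc_url(self) -> str:
        """Build the JDBC URL for this connection's vendor."""
        try:
            template = _URL_TEMPLATES[self.vendor]
        except KeyError:
            raise ValueError(f"unsupported vendor: {self.vendor}") from None
        return template.format(
            host=self.host, port=self.port, database=self.database
        )


# Two connections to the same database, each with its own user.
analyst = RelationalConnection(
    "sales_readonly", "postgresql", "db.example.com", 5432, "sales", "analyst"
)
etl = RelationalConnection(
    "sales_etl", "postgresql", "db.example.com", 5432, "sales", "etl_service"
)
```

Both definitions resolve to the same JDBC URL, so which tables and views each one can read depends on the permissions granted to its database user.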
If you have enabled Kerberos on the Hadoop cluster, you can leverage the Kerberos global keytab to enable SSO connections to relational sources. See Enable SSO for Relational Connections.