
Trifacta Dataprep




Feature Availability: This feature is available in the following editions:

  • Dataprep by Trifacta Enterprise Edition
  • Dataprep by Trifacta Professional Edition
  • Dataprep by Trifacta Premium


You can create connections to SFTP servers to upload your datasets to the Trifacta® application.

Linux- and Windows-based SFTP servers are supported. 

Jobs can be executed from SFTP sources on the following running environments:

  • Trifacta Photon
  • Dataflow


Read: Supported

Write: Not supported

Limitations

  • Read-only connection

  • Files and folders with spaces or special characters in their names cannot be used. For example, a file or folder on the SFTP server with a hash character (#) in its name cannot be used for data.
    • Files and folders whose names begin with underscore (_) are not visible.
  • Ingesting more than 500 files through SFTP at one time is not supported.

  • For private SFTP servers, you cannot run jobs on Dataflow. These jobs must be run using Trifacta Photon.

  • You cannot run jobs using Avro or Parquet sources uploaded via SFTP.
  • File types, such as Excel or PDF, that require use of the conversion service cannot be imported via SFTP connections.
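As a rough illustration, the naming and volume limitations above could be checked before pointing a connection at a directory. This is a minimal sketch, not part of the product: the function names are hypothetical, the set of "safe" characters is an assumption (the limitation does not enumerate which special characters are rejected), and the file extensions are assumed stand-ins for the Avro, Parquet, Excel, and PDF formats.

```python
import re
from pathlib import PurePosixPath

MAX_FILES_PER_INGEST = 500  # limit stated in the list above

# Assumption: anything outside letters, digits, dot, hyphen, and underscore
# counts as a "special character".
SAFE_SEGMENT = re.compile(r"^[A-Za-z0-9._-]+$")

# Assumption: extensions standing in for the formats named above.
BLOCKED_EXTENSIONS = {".avro", ".parquet", ".xls", ".xlsx", ".pdf"}


def check_path(path):
    """Return a list of reasons why a remote path may be unusable."""
    problems = []
    remote = PurePosixPath(path)
    for segment in remote.parts:
        if segment == "/":
            continue  # skip the root marker
        if segment.startswith("_"):
            problems.append(f"'{segment}' begins with an underscore and is not visible")
        elif not SAFE_SEGMENT.match(segment):
            problems.append(f"'{segment}' contains a space or special character")
    if remote.suffix.lower() in BLOCKED_EXTENSIONS:
        problems.append("file type is not supported over SFTP")
    return problems


def check_batch(paths):
    """Check a batch of paths; return only the paths that have problems."""
    if len(paths) > MAX_FILES_PER_INGEST:
        raise ValueError(
            f"{len(paths)} files exceeds the limit of {MAX_FILES_PER_INGEST} per ingest"
        )
    report = {}
    for path in paths:
        problems = check_path(path)
        if problems:
            report[path] = problems
    return report
```

For example, `check_path("/data/q1#final.csv")` reports the hash character, while `check_path("/data/sales_2023.csv")` returns an empty list.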


Prerequisites

  • Acquire user credentials to access the SFTP server. You can use username/password credentials or SSH keys. See below.

  • Verify that the credentials can access the proper locations on the server where your data is stored. The initial directory of the user account must be accessible.

SSH Keys

If preferred, you can use SSH keys for authentication to the SFTP server.

NOTE: SSH keys must be private RSA keys. If you have OpenSSH keys, you can use the ssh-keygen utility to convert them to private RSA keys.
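One way to tell which format a key file uses is to look at its first line. The sketch below assumes that "private RSA keys" means PEM-encoded keys beginning with `-----BEGIN RSA PRIVATE KEY-----`, and that an OpenSSH-format key begins with `-----BEGIN OPENSSH PRIVATE KEY-----`. The helper names are hypothetical. The command it builds uses the `-p`, `-m PEM`, and `-f` options of `ssh-keygen` as found in recent OpenSSH releases. That command rewrites the key file in place, so it is safer to run it against a copy, and it only produces an RSA PEM key if the original key is an RSA key.

```python
import shlex

PEM_RSA_HEADER = "-----BEGIN RSA PRIVATE KEY-----"
OPENSSH_HEADER = "-----BEGIN OPENSSH PRIVATE KEY-----"


def key_format(key_text):
    """Classify a private key by its first line: 'rsa-pem', 'openssh', or 'unknown'."""
    stripped = key_text.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    if first_line == PEM_RSA_HEADER:
        return "rsa-pem"
    if first_line == OPENSSH_HEADER:
        return "openssh"
    return "unknown"


def conversion_command(key_path):
    """Build (but do not run) an ssh-keygen command that rewrites a key as PEM."""
    # -p rewrites the existing key file in place, -m PEM selects the output
    # format, and -f names the key file.
    return shlex.join(["ssh-keygen", "-p", "-m", "PEM", "-f", key_path])
```

If `key_format` reports `openssh`, the string returned by `conversion_command` can be reviewed and then run in a terminal.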

Whitelist SFTP server


If your SFTP server is private, you must add the SFTP server to the whitelist of IPs that are permitted to communicate with the cluster. For more information, please see the documentation that is provided with your software distribution. 

You must also add the SFTP server to the whitelist of file storage systems. Details are below.

Create Connection

Create through application

You can create an SFTP connection through the Trifacta application.

Steps:

  1. In the left nav bar, select the Connections icon. See Connections Page.
  2. In the Connections page, click Create Connection. See Create Connection Window.
  3. In the Create Connection window, click the SFTP connection card.
  4. Specify the properties for your SFTP server.

    Property: Description

    Host: The hostname of the SFTP server to which you are connecting. Do not include any protocol identifier (sftp://).

    Port: The port number to use to connect to the server. Default port number is 22.

    Credential Type: Select one of the following:

      • basic - authenticate via username and password
      • SSH Key - authenticate via username and SSH key

    User Name: The username to use to connect.

    Password: (Basic credential type) The password associated with the username.

    SSH Key: (SSH Key credential type) The SSH key that applies to the username.

    Test Connection: Click this button to test the connection that you have specified.

    Default Directory: Absolute path on the SFTP server where users of the connection can begin browsing.

    Block Size (Bytes): Fetch size in bytes for each read from the SFTP server.

    NOTE: Raising this value may increase speed of read operations. However, if it is raised too high, resources can become overwhelmed, and the read can fail.

    Connection Name: The name of the connection as you want it to appear in the application.

    Description: This description is displayed in the application.

    For more information, see Create Connection Window.

  5. Click Save.
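The property rules in step 4 can be sanity-checked before the values are entered. The following is a minimal sketch under stated assumptions: the class name, the field names, and the `"basic"` / `"ssh_key"` identifiers are hypothetical and do not correspond to any product API, and the default directory of `/` is chosen only for illustration. Only the rules themselves (no protocol identifier in the host, default port 22, a password or an SSH key depending on credential type, an absolute default directory) come from the property descriptions.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PORT = 22  # default port stated in the property list


@dataclass
class SftpConnectionSettings:
    """Hypothetical container for the values entered in the connection window."""
    host: str
    user_name: str
    credential_type: str = "basic"   # hypothetical identifiers: "basic" or "ssh_key"
    password: Optional[str] = None
    ssh_key: Optional[str] = None
    port: int = DEFAULT_PORT
    default_directory: str = "/"     # assumption: start browsing at the root

    def validate(self):
        """Return a list of human-readable problems; empty if none were found."""
        errors = []
        if "://" in self.host:
            errors.append("Host must not include a protocol identifier such as sftp://")
        if not 1 <= self.port <= 65535:
            errors.append("Port must be between 1 and 65535")
        if self.credential_type == "basic":
            if not self.password:
                errors.append("A password is required for the basic credential type")
        elif self.credential_type == "ssh_key":
            if not self.ssh_key:
                errors.append("An SSH key is required for the SSH Key credential type")
        else:
            errors.append(f"Unknown credential type: {self.credential_type}")
        if not self.default_directory.startswith("/"):
            errors.append("Default Directory must be an absolute path")
        return errors
```

A settings object built with a host of `sftp://sftp.example.com` would fail the first check, because the protocol identifier must be left out.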

Create through APIs

When you create the connection through the APIs, specify the following values:

  • Type: jdbc
  • Vendor: sftp

For more information, see the Dataprep by Trifacta API Reference docs: Enterprise | Professional | Premium | Standard
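As a loose sketch of how the two values above might appear in a request body, the helper below builds a JSON payload without sending it. Only `type: jdbc` and `vendor: sftp` come from this page. The function name and every other key (`name`, `host`, `port`, `credentialType`, `credentials`) are assumptions made for illustration, so check the API reference for your edition for the actual field names and endpoint.

```python
import json


def build_sftp_connection_payload(name, host, username, password, port=22):
    """Build an illustrative request body for creating an SFTP connection."""
    return {
        "type": "jdbc",    # value given on this page
        "vendor": "sftp",  # value given on this page
        # The keys below are hypothetical and shown for illustration only.
        "name": name,
        "host": host,      # hostname only, without sftp://
        "port": port,      # 22 is the documented default port
        "credentialType": "basic",
        "credentials": [{"username": username, "password": password}],
    }


payload = build_sftp_connection_payload(
    name="sftp-landing-zone",
    host="sftp.example.com",
    username="analyst",
    password="example-password",
)
body = json.dumps(payload, indent=2)  # serialized form that a client might send
```

Building the payload separately from any HTTP call makes it easy to inspect the body and adjust field names once they are confirmed against the reference documentation.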
