Trifacta Dataprep




Feature Availability: This feature may not be available in all product editions.

This feature is disabled by default. For more information about enabling REST API connectivity in your environment, please contact Alteryx Support.

The REST API connection type provides a generic interface to relational data available through REST APIs. Using this connection type, you can create connections to individual endpoints across hundreds of REST-based applications.

Early Preview connection: This connection is in early preview. It is read-only and available only in SaaS product editions. For more information on early previews, see Early Preview Connection Types.


Limitations

  • Import-only connection type
  • A limited set of request methods is supported. See the Method entry under "Configure endpoints" below.
  • Using a passphrase when generating an SSH key is not supported.
  • JSON response from API endpoint is required. API endpoints that return XML responses are not supported.
  • OAuth 2.0 authentication is not supported.
  • After the initial connection to an endpoint is made, the schema is cached. The schema is not updated again until the connection is edited.
  • By default, the number of endpoints that you can specify to use an individual connection is 10. This limit can be modified.

Prerequisites

  • You should identify the tables and (optional) data models for them that you wish to access. 

  • You should acquire the credentials to access your target endpoints for one of the supported authentication methods. 

    • If you are using a key or token to access the endpoints, you should generate this token before you begin. 

    • For the supported authentication methods, see "Authentication types" below.


Configure

To create this connection, in the Connections page, select the Applications tab. Click the REST API card. See Connections Page.

Modify the following properties as needed:

Property | Description
Base URI

The base URI for the endpoints to which you wish to connect through this connection. Example:

https://exampleserver.sharepoint.com/sites/SharePointTest

Tip: SSL access is supported over the HTTPS protocol.

Connect String Options

Apply any connection string options that are part of your authentication to REST API.

A default string has been provided for you. For more information, see "Connect string options" below.

Authentication Type

The method by which you wish to authenticate to the endpoint. See "Authentication types" below.

API endpoints

Specify the endpoints to which to connect. For more information, see "Configure endpoints" below.

Test Connection

After you have defined the REST API credentials and connection string, you can validate those credentials.

Connection Name

Display name of the connection.

Connection Description

Description of the connection, which appears in the application.

Authentication types

The following types of authentication are supported for REST API connections. For each type, additional properties may require configuration. 

Basic auth

A username/password combination is submitted as part of any request for authentication.

Property | Description
Username | Username to access the endpoints.
Password | Password associated with the username.

HTTP Header Based Auth

Authentication is submitted using a key/value pair submitted in the HTTP request header.

Property | Description
Header Key | Key for the header parameter used in authentication.
Header Value | Credential associated with the header authentication key.

HTTP Query Based Auth

Authentication is submitted using a query parameter key/value pair submitted as part of the URL.

Property | Description
Query Key | Key for the query parameter used in authentication.
Query Value | Credential associated with the query parameter authentication key.
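
As a rough illustration of where each authentication type places its credential, the following Python sketch builds (but does not send) one request per type using only the standard library. The helper names and the example URL are hypothetical. The Basic auth helper assumes the standard HTTP Basic scheme (Base64-encoded username:password in the Authorization header), which this page does not spell out.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def basic_auth_request(url: str, username: str, password: str) -> Request:
    """Basic auth: assumes the standard 'Authorization: Basic ...' header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


def header_auth_request(url: str, header_key: str, header_value: str) -> Request:
    """HTTP Header Based Auth: the key/value pair travels in the request header."""
    return Request(url, headers={header_key: header_value})


def query_auth_request(url: str, query_key: str, query_value: str) -> Request:
    """HTTP Query Based Auth: the key/value pair is appended to the URL."""
    separator = "&" if "?" in url else "?"
    return Request(url + separator + urlencode({query_key: query_value}))


# Hypothetical endpoint and credentials, for illustration only.
example = query_auth_request("https://example.com/api/items", "apiKey", "someKey")
print(example.full_url)
```

Because query-based credentials become part of the URL, they can end up in logs and browser history; header-based authentication avoids that when the target API supports both.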

Configure endpoints

Each endpoint and method combination must be configured. To add an endpoint, click Add endpoint.

The properties are described below.

Property | Description
Method

API request method to use. Supported methods:

  • GET - read from the endpoint
  • POST - create a new instance of an object in the target application through the endpoint
  • PUT - modify an existing instance

In some target systems, the PUT and POST methods are required for generating datasets for import. These methods should not be used for other use cases, such as writing data back to the target system. REST API connections are supported for import only.


NOTE: Other methods are not supported for use.

URL Endpoint

The endpoint that you are accessing using the specified method.

Tip: The Base URI value and this value should form a complete URL.

Table Name

(required) The name of the table with which you are interacting through this endpoint.

Data Model

Select the type of model used for the selected table:

  • Document - Data is returned as a document containing top-level elements which are represented as columns in the Trifacta application. Nested data is returned in aggregated form.
  • Relational - Data is returned in tabular form, in which each returned XPath represents an individual table containing a primary key and a foreign key linking to the parent document.
  • Flattened Documents - Data is stored as FlattenedArrays in the source system. A separate table is returned for each object array and is joined to its parent table. The parent table and each child table are joined into a single table for use in the Trifacta application.

For more information on these data model types, see https://cdn.cdata.com/help/DWE/jdbc/pg_RESTParsing.htm.
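
The difference between the three data models can be easier to see on a small example. The following Python sketch is only a conceptual illustration of the descriptions above, not the driver's actual output: the sample response, the function names, and the generated column names (such as _id, parent_id, and items.sku) are all hypothetical.

```python
import json

# Hypothetical API response: one parent document with a nested object array.
order = {
    "id": 7,
    "customer": "acme",
    "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}],
}


def as_document(doc):
    """Document: one row per document; nested data is kept in aggregated form."""
    row = {}
    for key, value in doc.items():
        # Nested arrays and objects are serialized into a single column value.
        row[key] = json.dumps(value) if isinstance(value, (list, dict)) else value
    return [row]


def as_relational(doc, array_key, parent_key="id"):
    """Relational: a parent table and a child table linked by a foreign key."""
    parent = [{k: v for k, v in doc.items() if k != array_key}]
    children = [
        {"_id": index, "parent_" + parent_key: doc[parent_key], **child}
        for index, child in enumerate(doc[array_key], start=1)
    ]
    return parent, children


def as_flattened(doc, array_key):
    """Flattened Documents: each child row is joined to its parent in one table."""
    parent = {k: v for k, v in doc.items() if k != array_key}
    return [
        {**parent, **{array_key + "." + k: v for k, v in child.items()}}
        for child in doc[array_key]
    ]


print(as_document(order))
print(as_relational(order, "items"))
print(as_flattened(order, "items"))
```

In this sketch the Document model yields one row, the Relational model yields one parent row plus two child rows, and the Flattened Documents model yields two joined rows.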

Pagination

Select the type of pagination used by the API endpoint. See "Pagination" below.

Advanced options: Custom Header

(optional) You can insert a custom header in the request as a key/value pair.

To add more headers, click Add.

Advanced options: Query Parameter

(optional) You can append a query parameter and value to the URL. These values are appended in the following form:

<endpoint_url>?<key1>=<value1>&<key2>=<value2>

To add more query parameters, click Add.

Advanced options: XPath

(optional) You can specify an XPath that is applied to the response returned from the URL.
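
To see how the Base URI, URL Endpoint, and Query Parameter properties combine into a single request URL, consider the following Python sketch. The function name is hypothetical, and the slash handling and percent-encoding shown here are assumptions about reasonable behavior, not a description of what the driver does internally.

```python
from typing import Mapping, Optional
from urllib.parse import urlencode


def build_request_url(
    base_uri: str,
    endpoint: str,
    query_params: Optional[Mapping[str, str]] = None,
) -> str:
    """Combine base URI, endpoint, and query parameters into one URL."""
    # Join with exactly one slash between the base URI and the endpoint path.
    url = base_uri.rstrip("/") + "/" + endpoint.lstrip("/")
    if query_params:
        # urlencode percent-encodes reserved characters in keys and values
        # and joins the pairs as key1=value1&key2=value2.
        url += "?" + urlencode(query_params)
    return url


print(
    build_request_url(
        "https://api.polygon.io/",
        "/v3/reference/tickers",
        {"limit": "1000", "date": "2021-11-04T00:00:00Z"},
    )
)
```

The values in the usage example are taken from the rate limiting example later on this page.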

Pagination

For the selected endpoint, you can specify the type of pagination in use by the target application. Specifying the pagination allows the Trifacta application to retrieve larger sets of records when pagination is in use. For more information on these pagination methods, see http://cdn.cdata.com/help/DWE/ado/pg_customschemaselect.htm.

None:

(default) No pagination is applied by the endpoint.

Next page URL: 

When this method is selected, the URL of the next page of results is returned as part of the response body. 

  • Page URL path defines the XPath in the response to the attribute containing the URL of the next page. 
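
The next page URL method can be sketched in Python as a loop that keeps following the URL found in each response until none is returned. This is a simplified illustration: fetch is a hypothetical callable that returns the parsed JSON body for a URL, and the top-level keys results and next_url stand in for the XPath values you would configure (they mirror the example later on this page).

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_next_page_url(
    first_url: str,
    fetch: Callable[[str], Dict[str, Any]],
    next_url_key: str = "next_url",
    records_key: str = "results",
) -> Iterator[Any]:
    """Yield records from every page by following the next-page URL."""
    url: Optional[str] = first_url
    while url:
        body = fetch(url)
        # Emit this page's records before moving on.
        yield from body.get(records_key, [])
        # A missing or empty next-page URL means this was the last page.
        url = body.get(next_url_key)


# Stand-in for real HTTP calls: two canned pages keyed by URL.
pages = {
    "https://example.com/items": {"results": [1, 2], "next_url": "https://example.com/items?cursor=x"},
    "https://example.com/items?cursor=x": {"results": [3]},
}
print(list(iter_next_page_url("https://example.com/items", pages.__getitem__)))
```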

Paging token:

A paging token may be returned as part of the response. To acquire the next page of results, this token must be submitted in the subsequent request as the value associated with the paging parameter.

  • Page token path is the XPath to the token that must be submitted with the next request. 
  • Page token param is the parameter in the request into which the page token must be submitted.
  • More pages param (optional) is used when the page token path must be submitted as a query parameter. This value defines the query parameter for which it is submitted. 
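
A minimal Python sketch of the paging token flow follows. It assumes the token sits at a top-level key of the JSON response and is sent back as a query parameter; fetch, next_token, and page_token are hypothetical names chosen for illustration, and the optional More pages param is not modeled.

```python
from typing import Any, Callable, Dict, Iterator


def iter_paging_token(
    fetch: Callable[[Dict[str, str]], Dict[str, Any]],
    token_key: str = "next_token",    # plays the role of "Page token path"
    token_param: str = "page_token",  # plays the role of "Page token param"
    records_key: str = "results",
) -> Iterator[Any]:
    """Yield records from every page, passing each token to the next request."""
    params: Dict[str, str] = {}
    while True:
        body = fetch(params)
        yield from body.get(records_key, [])
        token = body.get(token_key)
        if not token:
            # No token in the response: there are no more pages.
            break
        params = {token_param: token}


# Stand-in for real HTTP calls: pages keyed by the token that requests them.
canned = {
    None: {"results": ["a", "b"], "next_token": "t1"},
    "t1": {"results": ["c"]},
}
print(list(iter_paging_token(lambda params: canned[params.get("page_token")])))
```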

Record offset: 

Under this pagination method, each subsequent page of results is requested by specifying the number of records to skip (the offset) in the query.

  • Page offset param defines the query parameter where you can specify the page offset to query.

  • Page size param defines the request parameter that sets the number of records in each page.

  • Page size defines the number of records to request in a page.

Page number:

Similar to record offset, this method queries results based on specified page numbers.

  • Page number param defines the query parameter where you can specify the page number to query.

  • Page size param defines the request parameter that sets the number of records in each page.

  • Page size defines the number of records to request in a page.
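
The record offset and page number methods differ only in how the position of a page is expressed. The following Python sketch shows both side by side. The parameter names (offset, limit, page, per_page), the assumption that page numbering starts at 1, and the rule of stopping at the first short page are illustrative choices, not documented driver behavior.

```python
from typing import Any, Callable, Dict, Iterator, List


def offset_params(page_index: int, page_size: int) -> Dict[str, int]:
    """Record offset: skip page_index * page_size records."""
    return {"offset": page_index * page_size, "limit": page_size}


def page_number_params(page_index: int, page_size: int) -> Dict[str, int]:
    """Page number: request the Nth page, counting from 1."""
    return {"page": page_index + 1, "per_page": page_size}


def iter_pages(
    fetch: Callable[[Dict[str, int]], List[Any]],
    make_params: Callable[[int, int], Dict[str, int]],
    page_size: int,
) -> Iterator[Any]:
    """Yield records page by page until a page comes back short."""
    page_index = 0
    while True:
        records = fetch(make_params(page_index, page_size))
        yield from records
        if len(records) < page_size:
            # Fewer records than requested: this was the last page.
            break
        page_index += 1


# Stand-ins for a real endpoint holding seven records.
data = list(range(7))


def fetch_by_offset(p: Dict[str, int]) -> List[int]:
    return data[p["offset"]:p["offset"] + p["limit"]]


def fetch_by_page(p: Dict[str, int]) -> List[int]:
    return data[(p["page"] - 1) * p["per_page"]:p["page"] * p["per_page"]]


print(list(iter_pages(fetch_by_offset, offset_params, 3)))
print(list(iter_pages(fetch_by_page, page_number_params, 3)))
```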

Connect string options

Too many requests error

During execution, you may receive an error from the driver similar to the following:

PlatformErrors: Retries errors based on the exception message. E.g. Other=PlatformErrors="Too Many Requests"
MaximumRequestRetries: The number of times the driver will attempt to retry the request (Default 4)

In this case, the default wait time for retrying a request (2 seconds) is not enough time, and the requests are piling up. You can address this issue by inserting the following connect string options:

Other='RetryWaitTime=15000'

The above option sets the wait time before retrying to 15000ms (15 seconds). You can experiment with this value as needed.
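
To see why a longer wait helps, consider the following Python sketch of a fixed-wait retry loop. It is a conceptual model only, not the driver's implementation: the TooManyRequests exception and the call_with_retries function are hypothetical, while the defaults mirror the values mentioned above (4 retries, wait time in milliseconds).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TooManyRequests(Exception):
    """Hypothetical stand-in for a "Too Many Requests" failure."""


def call_with_retries(
    request: Callable[[], T],
    max_retries: int = 4,
    retry_wait_ms: int = 15000,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), waiting a fixed time between retries when rate limited."""
    attempt = 0
    while True:
        try:
            return request()
        except TooManyRequests:
            if attempt >= max_retries:
                # Retries exhausted: surface the error to the caller.
                raise
            attempt += 1
            # A fixed wait gives the rate limit window time to reset.
            sleep(retry_wait_ms / 1000)
```

With a 2-second wait, four retries span about 8 seconds, which may still fall inside a per-minute rate limit window. With a 15-second wait, the same four retries span about a minute.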

For more information on available connect string options, see https://cdn.cdata.com/help/DWE/ado/Connection.htm.

Create via API

This connection can also be created using the API.

  • Type: jdbc_rest
  • Vendor: jdbc_rest

Example - single GET method

The following example request creates a REST API connection with the following characteristics:

  • Query parameters are used for authentication
  • A single endpoint is enabled:
    • Method: GET
    • Target: table1
    • dataModel: Document

{
  "vendor": "jdbc_rest",
  "vendorName": "jdbc_rest",
  "name": "REST API test",
  "description": "",
  "type": "jdbc",
  "params": {
    "base_URI": "some base URI",
    "connectStrOpts": ""
  },
  "credentialType": "httpQueryBasedAuth",
  "credentials": [
    {
      "queryKey": "user",
      "queryValue": "token"
    }
  ],
  "endpoints": [
    {
      "tableName": "table1",
      "httpMethod": "get",
      "endpoint": "endpoint1",
      "requestBody": "",
      "xPath": "",
      "dataModel": "document",
      "headers": {},
      "queryParams": {}
    }
  ]
}
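
If you generate these request bodies programmatically, a small helper can keep the structure consistent. The following Python sketch assembles the same body as the example above. The function names are hypothetical, and the sketch only builds and prints the JSON; how the request is submitted is covered by the API reference and is not shown here.

```python
import json
from typing import Any, Dict, List, Optional


def get_endpoint(
    table_name: str,
    endpoint: str,
    data_model: str = "document",
    x_path: str = "",
    query_params: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build one GET endpoint entry using the fields shown in the example."""
    return {
        "tableName": table_name,
        "httpMethod": "get",
        "endpoint": endpoint,
        "requestBody": "",
        "xPath": x_path,
        "dataModel": data_model,
        "headers": {},
        "queryParams": dict(query_params or {}),
    }


def query_auth_connection(
    name: str,
    base_uri: str,
    query_key: str,
    query_value: str,
    endpoints: List[Dict[str, Any]],
    connect_str_opts: str = "",
) -> Dict[str, Any]:
    """Build a REST API connection body that uses query-based authentication."""
    return {
        "vendor": "jdbc_rest",
        "vendorName": "jdbc_rest",
        "name": name,
        "description": "",
        "type": "jdbc",
        "params": {"base_URI": base_uri, "connectStrOpts": connect_str_opts},
        "credentialType": "httpQueryBasedAuth",
        "credentials": [{"queryKey": query_key, "queryValue": query_value}],
        "endpoints": endpoints,
    }


payload = query_auth_connection(
    "REST API test", "some base URI", "user", "token",
    [get_endpoint("table1", "endpoint1")],
)
print(json.dumps(payload, indent=2))
```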

Example - rate limiting

The following example creates a REST API connection to polygon.io with the following characteristics:

  • Query-based authentication using key/value pair
  • Connect String Options:
    Other='RetryWaitTime=15000';
    • Wait for retry: 15000 milliseconds
  • Single endpoint to GET stock ticker information:
    • DataModel: Document
    • XPath: $./results
    • Rate limiting on the endpoint (maximum queries per minute): 1000
    • queryParams.date is a parameter that is passed in for this specific connection type.
    • Pagination: nextPageURL method


{
  "vendor": "jdbc_rest",
  "vendorName": "jdbc_rest",
  "name": "REST API",
  "description": "",
  "type": "jdbc",
  "params": {
    "base_URI": "https://api.polygon.io/",
    "connectStrOpts": "Other='RetryWaitTime=15000';"
  },
  "credentialType": "httpQueryBasedAuth",
  "credentials": [
    {
      "queryKey": "apiKey",
      "queryValue": "someKey"
    }
  ],
  "endpoints": [
    {
      "tableName": "tickers",
      "httpMethod": "get",
      "endpoint": "/v3/reference/tickers",
      "requestBody": "",
      "xPath": "$./results",
      "dataModel": "document",
      "headers": {},
      "queryParams": {
        "limit": "1000",
        "date": "2021-11-04T00:00:00Z"
      },
      "pagination": {
        "pageurlpath": "$./next_url",
        "paginationType": "nextPageURL"
      }
    }
  ]
}

For more information, see Dataprep by Trifacta API Reference docs: Enterprise | Professional | Premium | Standard

Use

You can import datasets from REST API through the Import Data page. See Import Data Page.

Tip: You can perform joins and unions using custom SQL as part of your initial request for data. It may be easier to import the tables as separate datasets and then to perform the join or union within the Trifacta application.


Using REST API Connections

This section describes how you interact through Dataprep by Trifacta® with your REST data.

Uses

Dataprep by Trifacta can use REST API connections for the following tasks:

  1. Import datasets

Before you begin

  • Read Access: You must have credentials to access the specific endpoints required to retrieve your data.

  • Write Access: Not supported

Secure access

SSL is available over HTTPS for REST API connections.

Reading data

You can create a Trifacta dataset from the following data models: 

  1. documents
  2. flattened documents
  3. relational tables

These source objects are represented as tables in the import browser and as grid data in the Transformer page.

For more information, see Database Browser.

Writing data

Not supported.

NOTE: Do not use the PUT and POST methods to write data back into the target system.

Reference

  • Read: Supported
  • Write: Not supported
