
This is the latest version of the APIs.

Get the specified imported dataset.

Version:  v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See API Authentication.

Request

Request Type: GET

Endpoint:

/v4/importedDatasets/<id>

where:

Parameter | Description
<id> | Internal identifier for the imported dataset

Endpoint with embedded reference:


Use the following embedded reference to include, in the response, data about the connection used to acquire the source dataset, if the dataset was created from a Hive or relational connection.

/v4/importedDatasets/<id>?embed=connection

Request URI - Example:

/v4/importedDatasets/63

Query parameter reference:

The following query parameters can be submitted with this endpoint:

Query Parameter | Data Type | Description
embed | string | Comma-separated list of objects to include as part of the response.
includeDeleted | string | If set to true, the response includes deleted objects.

For more information, see API Common Query Parameters v4.
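As a sketch of how the endpoint and query parameters fit together, the following Python snippet builds the request URI. The base URL shown is a placeholder for your own instance, not part of the API specification, and the required authentication headers (see API Authentication) are omitted.

```python
from urllib.parse import urlencode

def imported_dataset_url(base_url, dataset_id, embed=None, include_deleted=False):
    """Build the URI for GET /v4/importedDatasets/<id>.

    base_url is a placeholder for your Trifacta instance
    (e.g. "http://example.com:3005" -- an assumption, not from this page).
    embed is a list of objects to embed, joined into a comma-separated value.
    """
    url = f"{base_url}/v4/importedDatasets/{dataset_id}"
    params = {}
    if embed:
        params["embed"] = ",".join(embed)
    if include_deleted:
        params["includeDeleted"] = "true"
    if params:
        # Keep commas literal so the embed list stays comma-separated.
        url = url + "?" + urlencode(params, safe=",")
    return url

# Example: the request URI from above, with the connection embedded.
uri = imported_dataset_url("http://example.com:3005", 63, embed=["connection"])
```
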

Request Body:

Empty.

Response

Response Status Code - Success:  200 - OK

Response Body Example:

The following response comes from an uploaded file.

{
    "path": "/trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
    "bucket": null,
    "container": null,
    "type": "hdfs",
    "blobHost": null,
    "isDynamicOrConverted": false,
    "id": 9,
    "dynamicPath": null,
    "isSchematized": true,
    "isDynamic": false,
    "isConverted": false,
    "disableTypeInference": false,
    "hasStructuring": true,
    "createdAt": "2019-01-28T19:54:47.667Z",
    "updatedAt": "2019-01-28T19:54:47.847Z",
    "runParameters": {
        "data": []
    },
    "storageLocation": {
        "fullUri": "hdfs:///trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
        "id": 34,
        "path": "/trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
        "size": "292817",
        "workspaceId": 1,
        "type": "hdfs",
        "bucket": null,
        "blobHost": null,
        "container": null,
        "hash": "215944e85eaabf9fd7ae837b36ff80711abe7ae1",
        "createdAt": "2019-01-28T19:54:47.664Z",
        "updatedAt": "2019-01-28T19:54:47.664Z"
    },
    "size": "292817",
    "creator": {
        "id": 1
    },
    "updater": {
        "id": 1
    },
    "workspace": {
        "id": 1
    },
    "parsingRecipe": {
        "id": 25
    },
    "connection": null
}
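Two fields in this body deserve care when consumed programmatically: size is returned as a string, and a null connection indicates a local upload. A minimal sketch, using a trimmed copy of the body above:

```python
# Trimmed copy of the response body above; only the fields used below are kept.
dataset = {
    "id": 9,
    "type": "hdfs",
    "size": "292817",
    "connection": None,
    "storageLocation": {"hash": "215944e85eaabf9fd7ae837b36ff80711abe7ae1"},
}

size_bytes = int(dataset["size"])           # size is a string in the response
is_upload = dataset["connection"] is None   # null connection => local upload
```
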

Response Body Example with embedded reference:

The following response comes from a relational source and includes embedded information on the connection used to import the data.

{
    "jdbcTable": "datetable",
    "jdbcPath": [
        "public"
    ],
    "columns": [
        "start_date",
        "end_date"
    ],
    "filter": null,
    "raw": null,
    "id": 56,
    "size": "-1",
    "path": null,
    "dynamicPath": null,
    "type": "jdbc",
    "bucket": null,
    "isSchematized": true,
    "isDynamic": false,
    "disableTypeInference": false,
    "createdAt": "2018-01-31T23:51:36.179Z",
    "updatedAt": "2018-01-31T23:51:37.025Z",
    "connection": {
        "id": 2,
        "name": "redshift",
        "description": "",
        "type": "jdbc",
        "isGlobal": true,
        "credentialType": "custom",
        "credentialsShared": true,
        "uuid": "c54bbec0-e05f-11e7-aa39-995f61171ffd",
        "disableTypeInference": false,
        "createdAt": "2017-12-13T23:45:59.468Z",
        "updatedAt": "2017-12-13T23:46:09.039Z",
        "creator": {
            "id": 1
        },
        "updater": {
            "id": 1
        }
    },
    "parsingRecipe": {
        "id": 111
    },
    "relationalSource": {
        "relationalPath": [
            "public"
        ],
        "columns": [
            "start_date",
            "end_date"
        ],
        "filter": null,
        "raw": null,
        "id": 10,
        "tableName": "datetable",
        "createdAt": "2018-01-31T23:51:36.187Z",
        "updatedAt": "2018-01-31T23:51:36.187Z",
        "importedDataset": {
            "id": 56
        }
    },
    "runParameters": {
        "data": []
    },
    "name": "datetable",
    "description": "my table of dates",
    "creator": {
        "id": 1
    },
    "updater": {
        "id": 1
    }
}
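When the connection is embedded, the response nests the full connection object alongside the relationalSource details. A minimal sketch of pulling out the qualified source table and connection name (field values taken from the body above):

```python
# Trimmed copy of the embedded-connection response body above.
dataset = {
    "jdbcTable": "datetable",
    "type": "jdbc",
    "connection": {"id": 2, "name": "redshift", "type": "jdbc", "isGlobal": True},
    "relationalSource": {"relationalPath": ["public"], "tableName": "datetable"},
}

# Fully qualified source: <relational path>.<table>, via the named connection.
qualified = ".".join(dataset["relationalSource"]["relationalPath"]
                     + [dataset["relationalSource"]["tableName"]])
connection_name = dataset["connection"]["name"]
```
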

Response Body Example - dataset with parameters:

The following example body illustrates a dataset (id=29) that has been created with a regular expression pattern parameter.

{
    "id": 29,
    "size": "292817",
    "path": "/trifacta/uploads/1/efeb54fc-efee-4d5f-a92b-a44c09c60aaa/POS-r01.txt",
    "dynamicPath": "/trifacta/uploads/1/efeb54fc-efee-4d5f-a92b-a44c09c60aaa/POS-r.txt",
    "type": "hdfs",
    "bucket": null,
    "blobHost": null,
    "container": null,
    "isSchematized": true,
    "isDynamic": true,
    "disableTypeInference": false,
    "createdAt": "2018-03-26T22:33:17.386Z",
    "updatedAt": "2018-03-26T22:33:18.337Z",
    "parsingRecipe": {
        "id": 43
    },
    "runParameters": {
        "data": [
            {
                "value": {
                    "pattern": {
                        "regex": {
                            "value": "[0-9][0-9]"
                        }
                    }
                },
                "insertionIndices": [
                    {
                        "index": 62,
                        "order": 0
                    }
                ],
                "id": 2,
                "type": "path",
                "createdAt": "2018-03-26T22:33:17.533Z",
                "updatedAt": "2018-03-26T22:33:17.662Z",
                "runParameterEdit": {
                    "value": {
                        "pattern": {
                            "regex": {
                                "value": "[0-9][0-9]"
                            }
                        }
                    },
                    "insertionIndices": [
                        {
                            "index": 62,
                            "order": 0
                        }
                    ],
                    "id": 2,
                    "overrideKey": null,
                    "createdAt": "2018-03-26T22:33:17.658Z",
                    "updatedAt": "2018-03-26T22:33:17.658Z",
                    "runParameter": {
                        "id": 2
                    },
                    "creator": {
                        "id": 1
                    },
                    "updater": {
                        "id": 1
                    }
                },
                "importedDataset": {
                    "id": 29
                },
                "creator": {
                    "id": 1
                },
                "updater": {
                    "id": 1
                },
                "overrideKey": null
            }
        ]
    },
    "name": "Dataset with Parameters",
    "description": "Dataset with parameters using regular expressions",
    "creator": {
        "id": 1
    },
    "updater": {
        "id": 1
    },
    "connection": null
}

 

Reference

Common Properties:

The following properties are common to file-based and JDBC datasets.

Property | Description
path | For HDFS and S3 file sources, this value defines the path to the source. For JDBC sources, this value is not specified. For uploaded sources, this value specifies the location on the default backend storage layer where the dataset has been uploaded.
bucket | (If type=s3) Bucket on S3 where the source is stored.
container | (Azure only) If the dataset is stored on ADLS, this value specifies the container on the blob host where the source is stored.
type | Identifies the type of storage where the source is located. Values: hdfs, s3, jdbc
blobHost | (Azure only) If the dataset is stored on ADLS, this value specifies the blob host where the source is stored.
isDynamicOrConverted | True if the dataset is either a dynamic or a converted dataset.
id | Internal identifier of the imported dataset
dynamicPath | (Dataset with parameters only) Specifies the path without the parameters inserted into it. The full path is defined based on this value and the data in the runParameters area.
isSchematized | (If the source file is Avro, or type=jdbc) If true, schema information is available for the source.
isDynamic | If true, the imported dataset is a dynamic dataset (dataset with parameters). For more information, see Overview of Parameterization.
isConverted | If true, the imported dataset has been converted to CSV format for storage.
disableTypeInference | If true, the initial type inference performed on schematized sources by the Trifacta platform is disabled for this source. For more information, see Configure Type Inference.
hasStructuring | If true, initial parsing steps have been applied to the dataset.
createdAt | Timestamp for when the dataset was imported
updatedAt | Timestamp for when the dataset was last updated
runParameters | If runtime parameters have been applied to the dataset, they are listed here. See below for more information.
size | Size of the file in bytes (if applicable)
name | Internal name of the imported dataset
description | User-friendly description for the imported dataset
creator.id | Internal identifier of the user who created the imported dataset
updater.id | Internal identifier of the user who last updated the imported dataset
workspace.id | Internal identifier of the workspace into which the dataset has been imported.
parsingRecipe.id | If initial parsing is applied, this value contains the internal identifier of the recipe that performs the parsing.
connection.id | Internal identifier of the connection to the server hosting the dataset. If this value is null, the file was uploaded from a local file system.

To acquire the entire connection for this dataset, you can use either of the following endpoints:

/v4/importedDatasets?embed=connection
/v4/importedDatasets/:id?embed=connection

For more information, see API Connections Get v4.

runParameters reference:

The following properties are available in the runParameters area:

Property | Description
value.pattern.regex.value | Regular expression that is applied to the path.
insertionIndices.index | Index value for the location in the path where the parameter is applied
insertionIndices.order | Any applicable ordering applied to the values in the parameter. Values: 0 - ascending, 1 - descending
id | Internal identifier for the parameter.
type | Type of parameter. This value must be path.
createdAt | Timestamp for when the parameter was created
updatedAt | Timestamp for when the parameter was updated
runParameterEdit | Any runtime overrides applied to the parameter during job execution
importedDataset.id | Internal identifier for the dataset to which the parameter is applied
creator.id | Internal identifier of the user who created the dataset with parameters
updater.id | Internal identifier of the last user who modified the dataset with parameters
overrideKey | Any override values applied to the dataset with parameters at run time
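To illustrate how dynamicPath and a path parameter combine, the following sketch splices the regular expression into the path at insertionIndices.index and checks candidate paths against it. The composition logic is an assumption based on the fields documented above, not the platform's own matching implementation; the example values come from the id=29 response body.

```python
import re

def matches_parameterized_path(dynamic_path, run_parameters, candidate):
    """Check a candidate path against a dataset with one path parameter.

    Assumes a single parameter of type "path" with a regex pattern value,
    as in the id=29 example above. A sketch of how the fields relate, not
    a documented algorithm.
    """
    param = run_parameters["data"][0]
    idx = param["insertionIndices"][0]["index"]
    regex = param["value"]["pattern"]["regex"]["value"]
    # Splice the pattern into the template at the insertion index.
    pattern = (re.escape(dynamic_path[:idx]) + regex
               + re.escape(dynamic_path[idx:]))
    return re.fullmatch(pattern, candidate) is not None

# Values from the dataset-with-parameters example (id=29):
template = "/trifacta/uploads/1/efeb54fc-efee-4d5f-a92b-a44c09c60aaa/POS-r.txt"
run_parameters = {"data": [{
    "value": {"pattern": {"regex": {"value": "[0-9][0-9]"}}},
    "insertionIndices": [{"index": 62, "order": 0}],
}]}
hit = matches_parameterized_path(
    template, run_parameters,
    "/trifacta/uploads/1/efeb54fc-efee-4d5f-a92b-a44c09c60aaa/POS-r01.txt")
```
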

 

storageLocation reference:

The following properties are available in the storageLocation area:

Property | Description
fullUri | The full URI to the location where the dataset is stored.
id | Internal identifier of the storage location
path | For HDFS and S3 file sources, this value defines the path to the source. For JDBC sources, this value is not specified. For uploaded sources, this value specifies the location on the default backend storage layer where the dataset has been uploaded.
size | Size of the file in bytes (if applicable)
workspaceId | Internal identifier of the workspace into which the dataset has been imported.
type | Identifies the type of storage where the source is located. Values: hdfs, s3, jdbc
bucket | (If type=s3) Bucket on S3 where the source is stored.
blobHost | (Azure only) If the dataset is stored on ADLS, this value specifies the blob host where the source is stored.
container | (Azure only) If the dataset is stored on ADLS, this value specifies the container on the blob host where the source is stored.
hash | Hash value for the imported dataset. Tip: Changes in this value indicate that the source file has been modified.
createdAt | Timestamp for when the dataset was imported
updatedAt | Timestamp for when the dataset was last updated
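Per the tip on the hash property, the hash can be used to detect whether the source file changed between requests. A minimal sketch, with a hash value taken from the uploaded-file response body above; polling cadence and hash storage are up to the caller:

```python
def source_modified(previous_hash, response_body):
    """True if storageLocation.hash differs from a previously stored hash."""
    return response_body["storageLocation"]["hash"] != previous_hash

# Hash recorded from an earlier response (value from the example above).
known_hash = "215944e85eaabf9fd7ae837b36ff80711abe7ae1"
latest = {"storageLocation": {"hash": "215944e85eaabf9fd7ae837b36ff80711abe7ae1"}}
changed = source_modified(known_hash, latest)
```
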

 

Hive or Relational Source:

If the source data is from Hive or a relational system (type=jdbc), the following properties contain information on the source table, the imported columns, and any custom SQL filters applied to the table. Other properties are part of the common set.

Property | Description
jdbcTable | Name of the table from which the data is extracted. If a custom SQL query has been applied, this value is null.
jdbcPath | Name of the database from which the source was queried. If a custom SQL query has been applied, this value is null.
columns | List of columns imported from the source, pre-filtered. If a custom SQL query has been applied, this value is null.
filter | This value is empty.
raw | If custom SQL has been applied to the data source to filter the data before it is imported, all SQL statements are listed here. For more information, see Enable Custom SQL Query.
id | Internal identifier for the relational source
size | Size in bytes of the data. For relational sources, this value is -1, as the size is not available.

File:

File-based datasets support the common properties only.

Embedded connection:

For more information on the properties when the connection is embedded in the response, see API Connections Create v4.
