...

  1. To create an imported dataset, you must acquire the following information about the source. In the above example, the source is the POS-r01.txt file.

    1. path
    2. type
    3. name
    4. description
    5. bucket (if a file stored on S3)
  2. Construct the following request:

    Endpoint: http://www.example.com:3005/v3/importedDatasets
    Authentication: Required
    Method: POST
    Request Body:

    {
      "path": "/user/pos/POS-r01.txt",
      "type": "hdfs",
      "bucket": null,
      "name": "POS-r01.txt",
      "description": "POS-r01.txt"
    }



  3. You should receive a 201 - Created response with a response body similar to the following:

    {
      "id": 8,
      "size": "281032",
      "path": "/user/pos/POS-r01.txt",
      "isSharedWithAll": false,
      "type": "hdfs",
      "bucket": null,
      "isSchematized": false,
      "createdBy": 1,
      "updatedBy": 1,
      "updatedAt": "2017-02-08T18:38:56.640Z",
      "createdAt": "2017-02-08T18:38:56.560Z",
      "connectionId": null,
      "parsingScriptId": 14,
      "cpProject": null
    }


  4. You must retain the id value so you can reference it when you create the recipe.

  5. See API ImportedDatasets Create v3.

  6. Next, you create the recipe. Construct the following request:

    Endpoint: http://www.example.com:3005/v3/wrangledDataset
    Authentication: Required
    Method: POST
    Request Body:

    {
      "name": "POS-r01",
      "importedDataset": {"id": 8},
      "flow": {"id": 10}
    }



  7. You should receive a 201 - Created response with a response body similar to the following:

    {
      "id": 23,
      "flowId": 10,
      "scriptId": 24,
      "wrangled": true,
      "createdBy": 1,
      "updatedBy": 1,
      "updatedAt": "2017-02-08T20:28:06.067Z",
      "createdAt": "2017-02-08T20:28:06.067Z",
      "flowNodeId": null,
      "deleted_at": null,
      "activesampleId": null,
      "name": "POS-r01",
      "active": true
    }


  8. Retain the id value of the recipe from the response. For more information, see API WrangledDatasets Create v3.
     
  9. Repeat the above steps for each of the source files that you are adding to your flow.
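
The two POST requests in the steps above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names (`build_post`, `imported_dataset_request`, `recipe_request`, `extract_id`) are hypothetical, the endpoint names are copied from the steps above, and because the authentication scheme is not described here, credentials are assumed to arrive as ready-made headers from the caller. The functions only build the requests; nothing is sent.

```python
import json
import urllib.request

# Example host and port taken from the steps above.
BASE_URL = "http://www.example.com:3005/v3"


def build_post(endpoint, payload, headers=None):
    """Build (but do not send) a JSON POST request for the given endpoint."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/{endpoint}", data=body, method="POST"
    )
    request.add_header("Content-Type", "application/json")
    # Assumption: authentication is supplied by the caller as HTTP headers.
    for key, value in (headers or {}).items():
        request.add_header(key, value)
    return request


def imported_dataset_request(path, name, description,
                             source_type="hdfs", bucket=None, headers=None):
    """Request that creates an imported dataset from a source file."""
    payload = {
        "path": path,
        "type": source_type,
        "bucket": bucket,  # only set for files stored on S3
        "name": name,
        "description": description,
    }
    return build_post("importedDatasets", payload, headers)


def recipe_request(name, imported_dataset_id, flow_id, headers=None):
    """Request that creates a recipe for an imported dataset inside a flow."""
    payload = {
        "name": name,
        "importedDataset": {"id": imported_dataset_id},
        "flow": {"id": flow_id},
    }
    return build_post("wrangledDataset", payload, headers)


def extract_id(response_body):
    """Return the id field from a JSON response body given as text."""
    return json.loads(response_body)["id"]
```

To send a request, pass it to `urllib.request.urlopen`, check that the response status is 201, and hand the decoded body to `extract_id`. The id of the imported dataset then feeds `recipe_request`, and the same pair of calls can be repeated for each source file.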

...

Endpoint: http://www.example.com:3005/v3/jobGroups/<id>/status
Authentication: Required
Method: GET
Request Body: None.
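
As a rough sketch, the status request above can be built and polled in Python as follows. The names `status_request` and `poll` are hypothetical, and because the shape of the status response is not shown here, the caller supplies both the function that fetches a status and the test that decides when the job group has finished. Authentication headers are likewise assumed to come from the caller.

```python
import time
import urllib.request

# Example host and port taken from the endpoint above.
BASE_URL = "http://www.example.com:3005/v3"


def status_request(job_group_id, headers=None):
    """Build (but do not send) the GET request for a job group's status."""
    url = f"{BASE_URL}/jobGroups/{job_group_id}/status"
    # Assumption: authentication is supplied by the caller as HTTP headers.
    return urllib.request.Request(url, headers=headers or {}, method="GET")


def poll(fetch_status, is_finished, interval_seconds=5.0, max_attempts=60):
    """Call fetch_status() until is_finished(status) is true or attempts run out."""
    for attempt in range(max_attempts):
        status = fetch_status()
        if is_finished(status):
            return status
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError("job group did not finish within the polling budget")
```

Keeping the fetch and the completion test as parameters avoids hard-coding status values that this page does not list.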

...

Endpoint: http://www.example.com:3005/v3/jobGroups/<id>
Authentication: Required
Method: POST
Request Body:

{
  "wrangledDataset": {
    "id": 23
  }
}
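
For illustration, the request above might be assembled in Python like this. The function name `job_group_request` and its parameter names are hypothetical; the value substituted for `<id>` in the endpoint is passed through unchanged, and authentication headers are assumed to come from the caller. The request is built but not sent.

```python
import json
import urllib.request

# Example host and port taken from the endpoint above.
BASE_URL = "http://www.example.com:3005/v3"


def job_group_request(job_group_id, wrangled_dataset_id, headers=None):
    """Build (but do not send) the POST request that references a recipe by id."""
    payload = {"wrangledDataset": {"id": wrangled_dataset_id}}
    # Assumption: authentication is supplied by the caller as HTTP headers.
    merged_headers = {"Content-Type": "application/json", **(headers or {})}
    return urllib.request.Request(
        f"{BASE_URL}/jobGroups/{job_group_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers=merged_headers,
        method="POST",
    )
```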


...