Using the following URL endpoint, you can create a dataset from another application through the UI.
NOTE: This integration is not supported in the |
NOTE: Before using any UI integration, you must first log in to the application. If you are not logged in, you are redirected to the login page, where you can enter your credentials before reaching your target URL.
In addition to authenticating with the application, the user must also have the appropriate permissions to access the assets on the datastore.
For more information:
Topic | Section |
---|---|
HDFS: permissions and security | See Configure Hadoop Authentication. |
HDFS: usage | See Using HDFS. See HDFS Browser. |
S3: permissions and security | See Enable S3 Access. |
S3: usage | See Using S3. See S3 Browser. |
You can use this integration to create datasets from single files or a single directory. Below are some example URLs for sources from Hadoop HDFS or S3:
Datastore | Source Type | Example URL | Results |
---|---|---|---|
HDFS | Directory | hdfs:///user/warehouse/campaign_data/ | User can choose the file through the UI to use for the dataset. |
HDFS | File | hdfs:///user/warehouse/campaign_data/d000001_01.csv | User can complete the steps through the UI to create the dataset. |
S3 | Directory | s3:///3fad-demo/data/biosci/source/ | User can choose the file through the UI to use for the dataset. |
S3 | File | s3:///3fad-demo/data/biosci/source/1-DRUG15Q1.txt | User can complete the steps through the UI to create the dataset. |
NOTE: The above results assume that the user has the appropriate permissions to access the file or directory. If the user lacks permissions, an HTTP 404 error is displayed.
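As a rough illustration of how the example URLs above map to a datastore and source type, the following Python sketch classifies a source URI. The helper name `describe_source` is hypothetical, and treating a trailing slash as the marker of a directory is an assumption drawn only from the examples in the table, not a documented rule.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Datastores named in the example table, keyed by URI scheme.
SUPPORTED_SCHEMES = {"hdfs": "HDFS", "s3": "S3"}


def describe_source(uri: str) -> Tuple[str, str]:
    """Return (datastore, source type) for a source URI (hypothetical helper)."""
    parts = urlsplit(uri)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported datastore scheme: {parts.scheme!r}")
    # Assumption: a trailing slash marks a directory, as in the example URLs.
    source_type = "Directory" if parts.path.endswith("/") else "File"
    return SUPPORTED_SCHEMES[parts.scheme], source_type


for example in (
    "hdfs:///user/warehouse/campaign_data/",
    "s3:///3fad-demo/data/biosci/source/1-DRUG15Q1.txt",
):
    print(example, "->", describe_source(example))
```

A check like this only inspects the shape of the URI. It cannot tell whether the user has permission to read the file or directory.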
Steps:
HDFS (file):
hdfs:///user/warehouse/campaign_data/d000001_01.csv
S3 (directory):
s3:///3fad-demo/data/biosci/source/
Navigate your browser to the appropriate URL. The following example uses the HDFS file example from above. The path must be preceded by the base URL for the platform. For more information, see API - UI Integrations.
<base_url>/import/data?uri=hdfs:///user/warehouse/campaign_data/d000001_01.csv
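If another application needs to generate this link, the URL can be assembled from the base URL and the source URI. The Python sketch below is a minimal example under stated assumptions: the function name `build_import_url` and the base URL `https://platform.example.com` are placeholders, and percent-encoding characters other than `:` and `/` is an assumption rather than documented behavior.

```python
from urllib.parse import quote


def build_import_url(base_url: str, source_uri: str) -> str:
    """Build the UI import URL for a source URI (hypothetical helper)."""
    # Keep ":" and "/" literal so the result matches the documented example.
    # Encoding any other reserved character is an assumption.
    encoded_uri = quote(source_uri, safe=":/")
    return f"{base_url.rstrip('/')}/import/data?uri={encoded_uri}"


# Placeholder base URL; substitute the base URL of your own deployment.
url = build_import_url(
    "https://platform.example.com",
    "hdfs:///user/warehouse/campaign_data/d000001_01.csv",
)
print(url)
```

For the HDFS file example, this yields the same path and query string shown above, prefixed with the placeholder base URL.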
Example:
Dataset URL | flowId | datasetId |
---|---|---|
| 31 | 186 |
The flowId is consistent across all datasets that you imported through the above steps.
You can run jobs on the dataset through the following interfaces:
API: See API JobGroups Create v4.