If needed, you can specify multiple file or table targets as part of a single CLI job by listing them in a JSON file. In your CLI command, the path on the Alteryx node to this JSON file is specified as the publish_opt_file parameter, as in the following:
./trifacta_cli.py run_job --user_name <trifacta_user> --password <trifacta_password> --job_type spark --data redshift-test/datasources.tsv --script redshift-test/script.cli --cli_output_path ./job_info.out --profiler on --publish_opt_file /json/publish/file/publishopts.json
The file publishopts.json contains the specification of the targets.
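As a minimal sketch, the command above could be assembled programmatically before it is handed to a shell or process launcher. The helper name build_run_job_command is hypothetical; the flags and sample values are copied from the command shown above, and nothing here actually runs the job.

```python
import shlex


def build_run_job_command(user, password, publish_opt_file):
    """Assemble the run_job argument list using the flags from the command above.

    Hypothetical helper for illustration only; it is not part of the CLI.
    """
    return [
        "./trifacta_cli.py", "run_job",
        "--user_name", user,
        "--password", password,
        "--job_type", "spark",
        "--data", "redshift-test/datasources.tsv",
        "--script", "redshift-test/script.cli",
        "--cli_output_path", "./job_info.out",
        "--profiler", "on",
        # Path on the node to the JSON file that lists the targets.
        "--publish_opt_file", publish_opt_file,
    ]


argv = build_run_job_command(
    "<trifacta_user>",
    "<trifacta_password>",
    "/json/publish/file/publishopts.json",
)
# shlex.join (Python 3.8+) quotes each argument so the line is safe to paste into a shell.
print(shlex.join(argv))
```

Keeping the arguments as a list avoids quoting problems if the list is later passed to a process launcher instead of a shell.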
Tip: To generate this file, you can run the job through the application. After the job has completed, download the CLI script from the Recipe panel in the Transformer page. The downloaded publishopts.json file contains the specification for the targets of the job you just ran. See Recipe Panel.
Example publishopts.json file:
{
  "file": [
    {
      "path": "hdfs://hadoop:50070/trifacta/queryResults/admin@trifacta.local/POS-r01.csv",
      "action": "create",
      "format": "csv",
      "header": true,
      "asSingleFile": true,
      "compression": "none"
    },
    {
      "path": "hdfs://hadoop:50070/trifacta/queryResults/admin@trifacta.local/POS-r01.json",
      "action": "create",
      "format": "json",
      "header": false,
      "asSingleFile": false,
      "compression": "none"
    }
  ],
  "hive": [
    {
      "databaseName": "default",
      "tableName": "POS-r01",
      "action": "overwrite"
    }
  ]
}
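If you prefer to generate this file from a script rather than edit it by hand, the following sketch builds the same structure as the example and serializes it with Python's standard json module. The helper names file_target and hive_target are hypothetical; the property names and values are taken from the example above.

```python
import json


def file_target(path, fmt, header, as_single_file, compression="none"):
    """Return one entry for the "file" list (hypothetical helper)."""
    return {
        "path": path,
        "action": "create",
        "format": fmt,
        "header": header,
        "asSingleFile": as_single_file,
        "compression": compression,
    }


def hive_target(database_name, table_name, action):
    """Return one entry for the "hive" list (hypothetical helper)."""
    return {
        "databaseName": database_name,
        "tableName": table_name,
        "action": action,
    }


base = "hdfs://hadoop:50070/trifacta/queryResults/admin@trifacta.local/POS-r01"
publish_opts = {
    "file": [
        file_target(base + ".csv", "csv", header=True, as_single_file=True),
        file_target(base + ".json", "json", header=False, as_single_file=False),
    ],
    "hive": [hive_target("default", "POS-r01", "overwrite")],
}

# json.dumps turns Python True/False into JSON true/false.
text = json.dumps(publish_opts, indent=2)
print(text)
```

The resulting text can be saved as publishopts.json and referenced through the publish_opt_file parameter.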
NOTE: All of the following properties require valid values, unless noted.
File targets:
Property | Description |
---|---|
path | Full path to the target file. Path must include the protocol identifier, such as hdfs:// and the port number. |
action | The action to take on the file, such as create (used in the example above). Some limitations apply to these options. See Run Job Page. |
format | Output format for the file, such as csv or json (both used in the example above). |
header | Boolean value (true or false), as in the example above. |
asSingleFile | Boolean value (true or false), as in the example above. |
compression | (optional) Compression to apply to a text-based file. The example above uses none. If this property is not specified, no compression is applied to the output file. |
Hive targets:
Property | Description |
---|---|
databaseName | Name of the database. NOTE: The database must contain at least one table. |
tableName | Name of the table in the database to which to write. |
action | The write action to apply to the table, such as overwrite (used in the example above). Some limitations apply to these options. See Run Job Page. |
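The property rules in the two tables above can be checked before a job is submitted. The sketch below is an assumption-laden pre-flight check, not something the CLI provides: the function names check_path and check_publish_opts are hypothetical, and it only verifies that required properties are present and that each file path carries a protocol identifier and a port number. It does not validate the action, format, or compression values.

```python
from urllib.parse import urlsplit

# compression is optional, so it is not listed as required.
FILE_REQUIRED = ("path", "action", "format", "header", "asSingleFile")
HIVE_REQUIRED = ("databaseName", "tableName", "action")


def check_path(path):
    """Report a missing protocol identifier or port number in a target path."""
    problems = []
    parts = urlsplit(path)
    if not parts.scheme:
        problems.append("path has no protocol identifier")
    try:
        port = parts.port
    except ValueError:  # raised for a non-numeric or out-of-range port
        port = None
    if port is None:
        problems.append("path has no port number")
    return problems


def check_publish_opts(opts):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for kind, required in (("file", FILE_REQUIRED), ("hive", HIVE_REQUIRED)):
        for i, target in enumerate(opts.get(kind, [])):
            label = f"{kind}[{i}]"
            for key in required:
                if key not in target:
                    problems.append(f"{label}: missing {key}")
            if kind == "file" and isinstance(target.get("path"), str):
                problems.extend(f"{label}: {p}" for p in check_path(target["path"]))
    return problems


good = {
    "file": [{
        "path": "hdfs://hadoop:50070/trifacta/queryResults/admin@trifacta.local/POS-r01.csv",
        "action": "create", "format": "csv",
        "header": True, "asSingleFile": True,
    }],
    "hive": [{"databaseName": "default", "tableName": "POS-r01", "action": "overwrite"}],
}
print(check_publish_opts(good))
```

Returning a list of problems instead of raising on the first one lets you report every issue in the file in a single pass.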