In the Run Job page, you can specify transformation and profiling jobs for the currently loaded recipe. Available options include output formats and output destinations.
You can also configure the environment where the job is to be executed.
Tip: Jobs can be scheduled for periodic execution through the Flow View page. For more information, see Add Schedule Dialog.
Tip: Columns that have been hidden in the Transformer page still appear in the generated output. Before you run a job, verify that all currently hidden columns are acceptable to include in the output.
Run Job Page
Select the environment where you wish to execute the job. Some of the following environments may not be available to you. These options appear only if there are multiple accessible running environments.
NOTE: Running a job executes the transformations on the entire dataset and saves the transformed data to the specified location. Depending on the size of the dataset and available processing resources, this process can take a while.
Tip: The application attempts to identify the best running environment for you. You should choose the default option, which factors in the available environments and the size of your dataset to identify the most efficient processing environment.
Spark: Executes the job using the Spark running environment.
Advanced Execution Options:
- If Spark job overrides have been enabled in your environment, you can apply overrides to the specified job. See Spark Execution Properties Settings.
Profile results and assess data quality rules: Optionally, you can enable this option to generate a visual profile of your job results. If your job contains recipes with data quality rules, those rules are applied to the generated results and displayed in the Job Details page.
NOTE: You must choose to profile your results if you wish to see the data quality rules applied to your results.
When the profiling job finishes, details are available through the Job Details page, including links to download results.
- Disabling profiling of your output can improve the speed of overall job execution.
- See Job Details Page.
NOTE: Percentages for valid, missing, or mismatched column values may not add up to 100% due to rounding. This issue applies to the Photon running environment.
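The rounding effect described in the note above can be illustrated with a small sketch. It assumes each percentage is rounded independently to a whole number; the application's actual rounding method is not stated here, and the function name and sample counts are hypothetical.

```python
def rounded_percentages(counts):
    """Round each category's share of the total to a whole percent.

    Assumption: each value is rounded independently, which is one common
    way that displayed percentages end up not summing to 100.
    """
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


# Hypothetical column with three values, one in each category.
counts = {"valid": 1, "missing": 1, "mismatched": 1}
pcts = rounded_percentages(counts)

# Each share is 33.33...%, which rounds to 33, so the total is 99, not 100.
print(pcts, sum(pcts.values()))
```

In this case the displayed values are each correct to the nearest percent, even though their sum falls one point short of 100.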
Ignore recipe errors: Optionally, you can choose to ignore errors in your recipes and proceed with the job execution.
NOTE: When this option is selected, the job may complete with warnings. For notification purposes, jobs that complete with recipe errors are treated as successful jobs, although you may be notified that the job completed with warnings.
Details are available in the Job Details page. For more information, see Job Details Page.
You can add, remove, or edit the outputs that are generated from this job. For more information, see Publishing Actions.
To execute the job as configured, click Run. The job is queued for execution. After a job has been queued, you can track its progress toward completion. See Job Details Page.
Run jobs via API
You can use the available REST APIs to execute jobs for known datasets. For more information, see API Reference.
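As a rough illustration of what such a call might look like, the following sketch builds (but does not send) an HTTP request to start a job. The endpoint path `/v4/jobGroups`, the `wrangledDataset` payload field, the bearer-token header, and the host name are all assumptions made for this example; confirm the actual endpoint, payload, and authentication method in the API Reference for your deployment.

```python
import json
import urllib.request


def build_run_job_request(base_url, token, recipe_id):
    """Build a POST request that asks the platform to run a job.

    Assumptions (verify against the API Reference):
      - the endpoint path is /v4/jobGroups
      - the body identifies the recipe as {"wrangledDataset": {"id": ...}}
      - authentication uses a bearer access token
    """
    body = json.dumps({"wrangledDataset": {"id": recipe_id}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v4/jobGroups",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical host, token, and recipe id, for illustration only.
req = build_run_job_request("https://example.com", "MY_TOKEN", 7)
print(req.get_method(), req.full_url)

# To actually submit the job, you would pass the request to
# urllib.request.urlopen(req) and read the JSON response.
```

Building the request separately from sending it makes it easy to inspect the URL and payload before anything is submitted to the platform.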