You can use the Job Details page to explore details about successful or failed jobs, including outputs, dependency graph, and other metadata. Download results to your local desktop or, if enabled, explore a visual profile of the data in the results for further iteration on your recipe.
Cancel job: Click this button to cancel your job while it is still in progress.
NOTE: This option may not be available for all running environments.
NOTE: If you do not have permission to cancel a job, the appropriate permissions must be added to your IAM role by an administrator. The default IAM role available with the product has these permissions, but these permissions may not be present if you are using a custom or personal IAM role in your account. For more information on IAM permissions, see Required Dataprep User Permissions.
NOTE: In some cases, the product is unable to cancel the job from the application. In these cases, click View in Dataflow, and from there you can cancel the job in progress.
View Dataflow job: View the job as it was executed on Dataflow.
View BigQuery job: If the job was able to be executed in BigQuery, you can review the job in the BigQuery console.
NOTE: When you view the job in BigQuery, you are using your own credentials to access the BigQuery console, which may be different from the service account that was used to execute the job. In this case, the job in BigQuery may be reported as having errors when viewed with your credentials, even though the job succeeded under the service account. To see the job properly, the
Download profile as JSON: If visual profiling was enabled for the job, you can download a JSON representation of the profile to your desktop.
Tip: When you download your JSON profile, any rules applied to the generated results are included in the profile. Search for
In the Overview tab, you can review the job status, its sources, and the details of the job run.
NOTE: If your job failed, you may be prompted with an error message indicating a job ID that differs from the listed one. This job ID refers to the sub-job that is part of the job listed in the Job summary.
You can review a snapshot of the results of your job.
The output data section displays a preview of the generated output of your job.
NOTE: This section is not displayed if the job fails.
You can also perform the following:
View: If it is present, you can click the View link to view the job results in the datastore where they were written.
NOTE: The View link may not be available for all jobs.
Download: If it is present, click the Download link to download the generated job results to your local desktop.
NOTE: If you chose to generate a profile of your job results, the transformation and profiling tasks may be combined into a single task, depending on your environment. If they are combined and profiling fails, any publishing tasks defined in the job are not launched. You may be able to ad-hoc publish the generated results. See below.
If present, you can click the Show Warnings link to see any warnings pertaining to recipe errors, including the relevant step number.
You can also review the outputs generated as a result of your job. To review and export any of the generated results, click View all. See Output Destinations tab below.
Job ID: Unique identifier for the job
Tip: If you are using the REST APIs, this value can be used to retrieve and modify specifics related to this job. For more information, see API Reference.
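As a minimal sketch of how the Job ID might be passed to a REST API, the snippet below builds a request without sending it. The host name, the `/v4/jobGroups/<id>` path, and the bearer-token header are assumptions made for illustration only; consult the API Reference for the actual endpoints and authentication scheme.

```python
import urllib.request

# Hypothetical values: replace with the host and endpoint from the API Reference.
BASE_URL = "https://example.com"
JOB_PATH_TEMPLATE = "/v4/jobGroups/{job_id}"  # assumed path, not confirmed by this page


def build_job_request(job_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the details of one job."""
    url = BASE_URL + JOB_PATH_TEMPLATE.format(job_id=job_id)
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
        method="GET",
    )


req = build_job_request(1234, "my-token")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request once the placeholder values are replaced with real ones.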
Queued: Job has been queued for execution.
Running: Job is in progress.
Completed: Job has successfully executed.
NOTE: Invalid steps in a recipe are skipped, and it's still possible for the job to be executed successfully.
Failed: Job failed to complete.
NOTE: You can re-run a failed job from the Transformer page. If you have since modified the recipe, those changes are applied during the second run. See Transformer Page.
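If you track job status in your own scripts, the four states listed above could be modeled as follows. This is an illustrative sketch: the `JobStatus` enum and the `is_finished` helper are hypothetical names, not part of the product.

```python
from enum import Enum


class JobStatus(Enum):
    """The job states listed on the Overview tab."""
    QUEUED = "Queued"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"


def is_finished(status: JobStatus) -> bool:
    """A job is finished once it has either completed or failed."""
    return status in (JobStatus.COMPLETED, JobStatus.FAILED)


# Look up a status by the label shown in the application.
current = JobStatus("Running")
print(current.name, is_finished(current))
```

Keep in mind that Completed does not guarantee that every recipe step was applied, since invalid steps are skipped.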
Manual - Job was executed through the application interface.
Scheduled - Job was executed according to a predefined schedule. See Add Schedule Dialog.
For jobs sourced from relational datasets, you can optionally enable SQL-based optimizations. These optimizations push some of the steps in your recipe back to the datasource, where they are executed before the data is transferred to the running environment. Because less data is transferred, job performance improves.
When optimizations have been applied to your flow, they are listed on the Overview tab:
If an optimization is disabled or was not applied to the job run, it is not listed.
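The effect of this kind of optimization can be illustrated with a small sketch that uses an in-memory SQLite table as a stand-in for the relational datasource. The table, its columns, and the filter are hypothetical; the sketch shows only the general idea of pushing a filter step into the source query, not how the product generates its SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", 25.0), (3, "east", 40.0), (4, "west", 5.0)],
)

# Without pushdown: transfer every row, then filter in the running environment.
all_rows = conn.execute("SELECT id, region, amount FROM orders").fetchall()
filtered_locally = [row for row in all_rows if row[1] == "east"]

# With pushdown: the filter runs in the datasource, so fewer rows are transferred.
pushed_down = conn.execute(
    "SELECT id, region, amount FROM orders WHERE region = ? ORDER BY id",
    ("east",),
).fetchall()

print(f"transferred without pushdown: {len(all_rows)} rows")
print(f"transferred with pushdown: {len(pushed_down)} rows")
conn.close()
```

Both approaches produce the same result set; the difference is how many rows leave the datasource.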
If the job has successfully completed, you can review the set of generated outputs and export results.
Output Destinations tab
For each output, you can do the following:
View details: View details about the generated output in the side bar.
Download result: Download the generated output to your local desktop.
NOTE: Some file formats may not be downloadable to your desktop. See below.
Create imported dataset: Use the generated output to create a new imported dataset for use in your flows. See below.
NOTE: This option is not available for all file formats.
Click one of the provided links to download the file through your browser to your local desktop.
NOTE: If these options are not available, data download may have been disabled by an administrator.
Optionally, you can turn your generated results into new datasets for immediate use in the product. For the generated output, select Create imported dataset from its context menu.
NOTE: When you create a new dataset from your job results, the file or files that were written to the designated output location are used as the source. Depending on how your backend datastore permissions are configured, this location may not be accessible to other users.
After the new output has been written, you can create new recipes from it. See Build Sequence of Datasets.
If the product is connected to an external storage system, you may publish your job results to it. Requirements:
For more information, see Publishing Dialog.
If the output for your job included one or more pre- or post-job SQL script executions, you can review the status of their execution during the job.
NOTE: If a SQL script fails to execute, all downstream phases of the job fail to execute.
Tip: If the SQL script execution for this job encountered errors, you can review those errors through this tab. For more detailed information, click Download logs.
SQL scripts tab
Run before data ingest - Script was executed pre-job.
Run after data publish - Script was executed post-job, after the job results had been written.
Status: Current status and execution duration of the SQL script.
NOTE: If you have multiple SQL scripts for each setting, they may execute in parallel. For example, if you created three pre-job SQL scripts, there is no guarantee that they executed in the order in which they are listed.
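The ordering caveat can be illustrated with a generic sketch of parallel execution. The `run_script` function and the script names are hypothetical stand-ins; nothing here reflects how the product actually schedules SQL scripts.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_script(name: str) -> str:
    """Stand-in for executing one SQL script; returns the script name."""
    return name


scripts = ["script_1", "script_2", "script_3"]  # hypothetical pre-job scripts

with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(run_script, name) for name in scripts]
    # as_completed yields futures as they finish, not in submission order.
    finished = [future.result() for future in as_completed(futures)]

# Every script ran, but the completion order may differ from the listed order.
print(sorted(finished) == sorted(scripts))
```

If the statements in one script depend on the results of another, do not rely on the order in which the scripts are listed.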
Hover over a SQL script entry and click View details.
In the SQL script details window, you can review:
Any error messages that occurred during execution.
Tip: To review log information for any error messages, click Download logs.
For more information on these types of SQL scripts, see Create Output SQL Scripts.
Review the visual profile of your generated results in the Profile tab. Visual profiling can assist in identifying issues in your dataset that require further attention, including outlier values.
NOTE: This tab appears only if you selected to profile results in your job definition. See Run Job Page.
In particular, you should pay attention to the mismatched values and missing values counts, which identify the approximate percentage of affected values across the entire dataset. For more information, see Overview of Visual Profiling.
NOTE: The computational cost of generating exact visual profiling measurements on large datasets in interactive visual profiles severely impacts performance. As a result, visual profiles across an entire dataset represent statistically significant approximations.
NOTE: The product treats null values as missing values. Imported values that are null are generated as missing values in job results (represented in the gray bar). See Manage Null Values.
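To make the valid, mismatched, and missing categories concrete, the following sketch computes their percentages for a single column. It is an exact calculation over a small in-memory list, whereas the product reports approximations over the full dataset. The `profile_column` and `is_integer` helpers are hypothetical, and treating empty strings as missing is an assumption.

```python
from typing import Callable, Dict, Iterable, Optional


def profile_column(
    values: Iterable[Optional[str]],
    is_valid: Callable[[str], bool],
) -> Dict[str, float]:
    """Return the percentage of valid, mismatched, and missing values."""
    counts = {"valid": 0, "mismatched": 0, "missing": 0}
    for value in values:
        if value is None or value == "":
            # Nulls count as missing; empty strings are an assumption here.
            counts["missing"] += 1
        elif is_valid(value):
            counts["valid"] += 1
        else:
            counts["mismatched"] += 1
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: 100.0 * n / total for key, n in counts.items()}


def is_integer(text: str) -> bool:
    """Hypothetical type check for an Integer column."""
    try:
        int(text)
        return True
    except ValueError:
        return False


result = profile_column(["1", "2", None, "abc"], is_integer)
print(result)
```

In this example, the null entry is counted as missing and `"abc"` is counted as mismatched against the Integer type.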
Tip: Mouse over the color bars to see counts of values in the category.
Tip: Use the horizontal scroll bar to see profiles of all columns in wide datasets.
In the lower section, you can explore details of the transformations of individual columns. Use this area to explore mismatched or missing data elements in individual columns.
Depending on the data type of the column, varying information is displayed. For more information, see Column Statistics Reference.
Tip: You should review the type information for each column, which is indicated by the icon to the left of the column.
If you have defined data quality rules for your recipe, those rules are applied to the generated results. In the Rules tab, you can review the application of the rules across your entire dataset.
NOTE: To see the results of your rules on the entire dataset, you must enable profiling for the job. See Run Job Page.
Tip: When you download your profile as JSON, the rule definitions are included.
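Because the layout of the downloaded JSON is not described here, one schema-agnostic way to locate rule-related entries is to search every key in the file. The sketch below does that recursively. The sample content and the `validationRules` key are hypothetical; a real profile will be structured differently.

```python
import json
from typing import Any, Iterator, Tuple


def find_keys(node: Any, needle: str, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) for every dict key that contains `needle`, ignoring case."""
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}/{key}"
            if needle.lower() in key.lower():
                yield child_path, value
            yield from find_keys(value, needle, child_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_keys(item, needle, f"{path}/{index}")


# Hypothetical profile content, used only to demonstrate the search.
sample = json.loads(
    '{"columns": [{"name": "id"}], "validationRules": [{"name": "not-empty"}]}'
)
matches = list(find_keys(sample, "rule"))
for match_path, _ in matches:
    print(match_path)
```

To search a downloaded file instead, load it with `json.load` and pass the resulting object to `find_keys`.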
Data quality rules are created in the Transformer page. For more information, see Data Quality Rules Panel.
In this tab, you can review a simplified representation of the flow from which the job was executed. This flow view displays only the recipes and datasets that contributed to the generated results.
Tip: To open the full flow, you can click its name in the upper-left corner.
Dependency graph tab
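The idea of showing only the recipes and datasets that contributed to a result can be sketched as an upstream traversal of the flow graph. The flow below and the `contributing_nodes` helper are hypothetical illustrations, not the product's implementation.

```python
from typing import Dict, List, Set


def contributing_nodes(upstream: Dict[str, List[str]], output: str) -> Set[str]:
    """Return every node that feeds `output`, directly or indirectly."""
    seen: Set[str] = set()
    stack = [output]
    while stack:
        node = stack.pop()
        for parent in upstream.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical flow: each node maps to the nodes that it reads from.
flow = {
    "output_a": ["recipe_join"],
    "recipe_join": ["dataset_orders", "recipe_clean"],
    "recipe_clean": ["dataset_customers"],
    "recipe_unused": ["dataset_orders"],
}

graph = contributing_nodes(flow, "output_a")
print(sorted(graph))
```

Here `recipe_unused` is left out because nothing it produces reaches `output_a`.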
You can zoom the dependency graph canvas to display areas of interest in the flow graph.
The zoom control options are available at the top-right corner of the dependency graph canvas. The following are the available zoom options:
Tip: You can use the keyboard shortcuts listed in the zoom options menu to make quick adjustments to the zoom level.
In the Data sources tab, you can review all of the sources of data for the executing recipe.
Data sources tab
NOTE: If a flow is unshared with you, you cannot see or access the datasources for any jobs that you have already run on the flow, including any PDF profiles that you generated. You can still access the job results. This is a known issue.
If your flow references parameters, you can review the state of the parameters at the time of job execution.
NOTE: This tab appears only if the job is sourced from a flow that references parameters. For more information, see Overview of Parameterization.
When a webhook task has been triggered for this job, you can review the status of its delivery to the target system.
200 - Message was delivered successfully.