...

In the Run Job page, you can specify transformation and profiling jobs for the currently loaded recipe. Available options include output formats and output destinations.

You can also configure the environment where the job is to be executed.

NOTE: If the job is executed in an environment other than Photon, the job is queued for execution in the environment. During job execution, the product observes the job in progress and reports progress as needed back into the application. The product does not control the execution of the job.


Tip: Jobs can be scheduled for periodic execution through the Flow View page. For more information, see Add Schedule Dialog.

...

NOTE: Running a job executes the transformations on the entire dataset and saves the transformed data to the specified location. Depending on the size of the dataset and available processing resources, this process can take a while.

Tip: The application attempts to identify the best running environment for you. You should choose the default option, which factors in the available environments and the size of your dataset to identify the most efficient processing environment.

Photon: Executes the job in Photon, an embedded running environment hosted on the same server as the product.

...

NOTE: Percentages for valid, missing, or mismatched column values may not add up to 100% due to rounding. This issue applies to the Photon running environment.
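The rounding effect can be seen with a small worked example. The sketch below is illustrative only: the function name and the choice of rounding to whole percentages are assumptions, not the application's actual profiling code.

```python
def rounded_percentages(valid, missing, mismatched):
    """Return each category's share of the column, rounded to a whole percent.

    Hypothetical helper for illustration; not the product's profiler.
    """
    total = valid + missing + mismatched
    counts = {"valid": valid, "missing": missing, "mismatched": mismatched}
    if total == 0:
        return {name: 0 for name in counts}
    # Each share is rounded independently, so the rounded values
    # are not guaranteed to sum to exactly 100.
    return {name: round(100 * count / total) for name, count in counts.items()}


shares = rounded_percentages(1, 1, 1)
print(shares, sum(shares.values()))
# {'valid': 33, 'missing': 33, 'mismatched': 33} 99
```

With three equal categories, each share is 33.33...%, which rounds to 33%, so the displayed total is 99%.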

Validate Schema: When enabled, the schemas of the datasources for this job are checked for any changes since the last time that the datasets were loaded. Differences are reported in the Job Details page as a Schema validation stage. 

Tip: A schema defines the column names, data types, and ordering in a dataset.

Fail job if dataset schemas change: When Validate Schema is enabled, you can set this flag to automatically fail the job if there are differences between the stored schemas for your datasets and the schemas that are detected when the job is launched.
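As a rough mental model of these two options, the following sketch compares a stored schema against a newly detected one and optionally raises an error on any difference. All names here (`diff_schemas`, `validate_schema`, `SchemaChangedError`, the `fail_on_change` parameter) are hypothetical, and the comparison rules are assumptions based on the definition of a schema as column names, data types, and ordering. It is not the product's implementation.

```python
from typing import List, Tuple

# A schema is modeled as an ordered list of (column name, data type) pairs.
Schema = List[Tuple[str, str]]


class SchemaChangedError(Exception):
    """Raised when schemas differ and the job is set to fail on change."""


def diff_schemas(stored: Schema, detected: Schema) -> List[str]:
    """List differences in column names, data types, and ordering."""
    differences = []
    stored_types = dict(stored)
    detected_types = dict(detected)
    for name, old_type in stored_types.items():
        if name not in detected_types:
            differences.append(f"column removed: {name}")
        elif detected_types[name] != old_type:
            differences.append(
                f"type changed: {name} {old_type} -> {detected_types[name]}"
            )
    for name in detected_types:
        if name not in stored_types:
            differences.append(f"column added: {name}")
    # Compare the relative order of the columns present in both schemas.
    common_stored = [n for n, _ in stored if n in detected_types]
    common_detected = [n for n, _ in detected if n in stored_types]
    if common_stored != common_detected:
        differences.append("column order changed")
    return differences


def validate_schema(stored: Schema, detected: Schema,
                    fail_on_change: bool = False) -> List[str]:
    """Report differences; raise instead if fail_on_change is set."""
    differences = diff_schemas(stored, detected)
    if differences and fail_on_change:
        raise SchemaChangedError("; ".join(differences))
    return differences
```

With `fail_on_change=False`, the differences are only reported, which mirrors enabling Validate Schema alone. With `fail_on_change=True`, any difference stops the run, which mirrors the fail-job flag.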

NOTE: If you attempt to refresh the schema of a parameterized dataset based on a set of files, only the schema for the first file is checked for changes. If changes are detected, the other files are assumed to contain those changes as well. This can lead to changes being assumed or undetected in later files and potential data corruption in the flow.
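Because only the first file is checked, you may want to confirm outside the application that every file in the set shares the same header before running the job. The sketch below is one possible pre-check under stated assumptions: the function name is hypothetical, and the input is assumed to be a mapping from file name to that file's list of column names, which you would build yourself.

```python
def files_with_divergent_schema(headers_by_file):
    """Return the files whose column list differs from the first file's.

    headers_by_file: dict mapping file name -> list of column names,
    in the order the files are matched (dicts preserve insertion order).
    Hypothetical helper; not part of the product.
    """
    files = list(headers_by_file)
    if not files:
        return []
    reference = headers_by_file[files[0]]
    return [name for name in files[1:] if headers_by_file[name] != reference]


headers = {
    "sales_01.csv": ["id", "amount", "date"],
    "sales_02.csv": ["id", "amount", "date"],
    "sales_03.csv": ["id", "date", "amount"],  # columns reordered
}
print(files_with_divergent_schema(headers))
# ['sales_03.csv']
```

An empty result means every file matches the first one, so a first-file-only check would not miss a change.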

Tip: This setting prevents data corruption for downstream consumers of your executed jobs.

Tip: The default for Validate Schema is set at the workspace level. In the Run Job page, these settings are overrides for individual jobs.

For more information, see Overview of Schema Management.

Ignore recipe errors: Optionally, you can choose to ignore errors in your recipes and proceed with the job execution. 

...

You can use the available REST APIs to execute jobs for known datasets. For more information, see API Reference.
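As a rough illustration of launching a job over REST, the sketch below only builds a request object and does not send it. The endpoint path, payload field names, and bearer-token header are placeholders chosen for illustration, not the documented API. Consult the API Reference for the actual endpoint and request body.

```python
import json
import urllib.request


def build_run_job_request(base_url, dataset_id, token):
    """Build (but do not send) a POST request that asks for a job run.

    The path "/jobs" and the "datasetId" field are hypothetical
    placeholders; replace them with values from the API Reference.
    """
    payload = json.dumps({"datasetId": dataset_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/jobs",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_job_request("https://example.com/api", 42, "my-token")
print(request.get_method(), request.full_url)
# POST https://example.com/api/jobs
```

To send the request, you could pass it to `urllib.request.urlopen(request)` once the placeholders are replaced with real values.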
