Parameterization enables you to apply dynamic values to the data that you import and that you generate as part of job execution.

Parameter types:

  • Environment Parameters: A workspace administrator or project owner can specify parameters that are available across the environment, including default values for them.
  • Dataset Parameters: You can parameterize the paths to inputs for your imported datasets, creating datasets with parameters. For file-based imported datasets, you can parameterize the bucket where the source is stored.
  • Flow Parameters: You can create parameters at the flow level, which can be referenced in any recipe in the flow.
  • Output Parameters: When you run a job, you can create parameters for the output paths for file- or table-based outputs.

These parameters can be defined by timestamps, patterns, wildcards, or variable values that you specify at runtime.

Environment Parameters

Project owners or workspace administrators can define parameters that apply across the project or workspace environment. These parameters can be referenced by any user in the environment, but only a user with admin access can define, modify, or delete these parameters.

...

Tip: When specifying an imported dataset with parameters, you should attempt to be as specific as possible in your parameter definitions.

NOTE: When importing one or more Excel files as a parameterized dataset, you select worksheets to include from the first file. If there are worksheets in other Excel files that match the names of the worksheets that you selected, those worksheets are also imported. All worksheets are unioned together into a single imported dataset with parameters. Pattern-based parameters are not supported for import of Excel worksheets.
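The worksheet-matching behavior described in the note can be sketched as follows. This is a minimal illustration, not the product's implementation: workbooks are modeled as plain dictionaries, and `union_matching_worksheets` and the file names are hypothetical. It also assumes that a worksheet name absent from a later file is simply skipped.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for Excel workbooks:
# worksheet name -> list of rows.
Workbook = Dict[str, List[dict]]


def union_matching_worksheets(
    workbooks: Dict[str, Workbook], selected: List[str]
) -> List[dict]:
    """Union rows from every worksheet whose name was selected in the first file."""
    rows: List[dict] = []
    for sheets in workbooks.values():
        for sheet_name in selected:
            # Assumption: a workbook without this worksheet contributes nothing.
            rows.extend(sheets.get(sheet_name, []))
    return rows


workbooks = {
    "sales-01.xlsx": {"Orders": [{"id": 1}], "Notes": [{"id": 99}]},
    "sales-02.xlsx": {"Orders": [{"id": 2}], "Returns": [{"id": 3}]},
}

# "Orders" was selected from the first file, so only "Orders" sheets are unioned.
combined = union_matching_worksheets(workbooks, selected=["Orders"])
print(combined)
```

Worksheets that were not selected in the first file, such as `Returns` in the second workbook, are not imported even though they exist.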

Mismatched Schemas

All datasets imported using a single parameter are expected to have schemas that match exactly. The schema for the entire dataset is taken from the first dataset that matches for import.
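Because the schema is taken from the first matching dataset, it can help to compare column lists before import. The following is a rough sketch of such a check, assuming you already have each file's column names in hand; `find_schema_mismatches` and the file names are hypothetical and not part of the product.

```python
from typing import Dict, List


def find_schema_mismatches(schemas: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return the files whose column list differs from the first file's."""
    names = list(schemas)
    if not names:
        return {}
    # The first matching file defines the reference schema.
    reference = schemas[names[0]]
    return {name: cols for name, cols in schemas.items() if cols != reference}


schemas = {
    "orders-2024-01.csv": ["id", "region", "units"],
    "orders-2024-02.csv": ["id", "region", "units"],
    "orders-2024-03.csv": ["id", "units", "region"],  # same columns, different order
}

mismatches = find_schema_mismatches(schemas)
print(mismatches)
```

The comparison is order-sensitive on purpose, since an exact match is the stated requirement.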

...

  • You cannot create datasets with parameters from uploaded data.
  • You cannot create datasets with parameters from multiple file types.
    • File extensions can be parameterized. Mixing of file types (e.g. TXT and CSV) only works if they are processed in an identical manner, which is rare.
    • You cannot create parameters across text and binary file types.

  • For datasources that require conversion, such as Excel, PDF, or JSON files, you can create a dataset with parameters from a maximum of 500 converted files.
  • Parameter and variable names can be up to 255 characters in length.
  • For regular expressions, the following reference types are not supported due to the length of time to evaluate:
    • Backreferences. The following example matches on axa, bxb, and cxc, yet generates an error:

      Code Block
      ([a-c])x\1
    • Lookahead assertions: The following example matches on a, but only when it is part of an ab pattern. It generates an error:

      Code Block
      a(?=b)
  • For some source file types, such as Parquet, the schemas between source files must match exactly.
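To make the two unsupported regular expression constructs concrete, the sketch below uses Python's `re` module, which does support them, to show what each pattern matches. It then shows one way to express the same matches without a backreference or a lookahead. Whether these rewrites are accepted in a parameter definition is an assumption to verify in your environment.

```python
import re

candidates = ["axa", "bxb", "cxc", "axb"]

# Backreference: \1 must repeat whatever the first group captured.
backreference = re.compile(r"([a-c])x\1")
backref_hits = [s for s in candidates if backreference.fullmatch(s)]

# Same matches without a backreference: spell out each alternative.
alternation = re.compile(r"axa|bxb|cxc")
alternation_hits = [s for s in candidates if alternation.fullmatch(s)]

# Lookahead: match "a" only when "b" follows, without consuming the "b".
lookahead = re.compile(r"a(?=b)")
lookahead_hits = lookahead.findall("ab ac ab")

# Without a lookahead: consume the "b" and capture only the "a".
grouped = re.compile(r"(a)b")
grouped_hits = grouped.findall("ab ac ab")

print(backref_hits, alternation_hits)
print(lookahead_hits, grouped_hits)
```

The capturing-group rewrite consumes the trailing `b`, so it is not a drop-in replacement when matches need to overlap.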

...

NOTE: Matching file path patterns in a large directory can be slow. Where possible, avoid using multiple patterns to match a file pattern or scanning directories with a large number of files. To increase matching speed, avoid wildcards in top-level directories and be as specific as possible with your wildcards and patterns.

Options:

  • You can choose to search nested folders for files that match your specified pattern.
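The effect of a specific wildcard versus a broad one, with and without the nested-folder option, can be sketched as follows. This is an illustration only: it uses Python's `fnmatch` wildcard syntax rather than the product's matching engine, and `match_files`, `include_nested`, and the paths are hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename, dirname
from typing import List


def match_files(
    paths: List[str], directory: str, pattern: str, include_nested: bool = False
) -> List[str]:
    """Return paths under `directory` whose file name matches `pattern`."""
    matched = []
    for path in paths:
        parent = dirname(path)
        in_directory = parent == directory
        # Nested folders are considered only when explicitly requested.
        in_nested = include_nested and parent.startswith(directory + "/")
        if (in_directory or in_nested) and fnmatchcase(basename(path), pattern):
            matched.append(path)
    return matched


paths = [
    "/data/sales/2024-01.csv",
    "/data/sales/2024-02.csv",
    "/data/sales/archive/2023-12.csv",
    "/data/sales/readme.txt",
]

# Specific pattern, top-level folder only.
top_level = match_files(paths, "/data/sales", "2024-*.csv")

# Broader pattern that also searches nested folders.
nested = match_files(paths, "/data/sales", "*.csv", include_nested=True)

print(top_level)
print(nested)
```

The narrower pattern examines the same candidates here, but in a large directory tree a specific prefix and a non-nested search keep the set of files to evaluate small.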

Tip: If your imported dataset is stored in a bucket, you can parameterize the bucket name, which can be useful if you are migrating flows between environments or must change the bucket at some point in the future.

For more information, see Create Dataset with Parameters.

...

  • Literal values: These values are always of String data type.

    Tip: You can wrap flow parameter references in your transformations with one of the PARSE functions. For more information, see Create Flow Parameter.

    NOTE: Wildcards are not supported.

  • Patterns. For more information, see Text Matching.
  • Regular expressions.
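Because literal parameter values are always of String data type, a value used in a numeric comparison needs to be parsed first. The following Python sketch is an analogy for that idea, not the product's PARSE functions; `flow_parameters`, `parse_int`, and the sample rows are hypothetical.

```python
# Hypothetical flow parameters: every value arrives as a string.
flow_parameters = {"threshold": "25", "region": "EMEA"}


def parse_int(value: str) -> int:
    """Convert a string parameter value to an integer."""
    return int(value.strip())


rows = [
    {"region": "EMEA", "units": 30},
    {"region": "EMEA", "units": 10},
]

# Parse once, then compare numbers with numbers.
threshold = parse_int(flow_parameters["threshold"])
kept = [row for row in rows if row["units"] > threshold]
print(kept)
```

Comparing `row["units"]` directly against the unparsed string `"25"` would raise a `TypeError` in Python, which is the kind of type mismatch that parsing avoids.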

...

When you are creating or editing a publishing action in the Run Job page, you can click the Parameterize destination link that appears in the right panel.
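Conceptually, a parameterized destination is a path template whose placeholders are filled in when the job runs. A minimal sketch, assuming a timestamp parameter and one variable parameter: the `${...}` syntax below is Python's `string.Template`, not the product's notation, and `resolve_output_path` and the paths are hypothetical.

```python
from datetime import datetime
from string import Template


def resolve_output_path(
    template: str,
    variables: dict,
    run_time: datetime,
    timestamp_format: str = "%Y-%m-%d",
) -> str:
    """Fill a path template with variable values and a formatted job timestamp."""
    values = dict(variables)
    values["timestamp"] = run_time.strftime(timestamp_format)
    # substitute() raises KeyError if the template names an undefined parameter.
    return Template(template).substitute(values)


path = resolve_output_path(
    "/outputs/${region}/sales_${timestamp}.csv",
    {"region": "emea"},
    datetime(2024, 3, 15, 8, 30),
)
print(path)
```

Failing on an undefined placeholder, rather than writing to a half-resolved path, is a deliberate choice in this sketch.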

...

  • Data sources tab: For file-based parameterized datasets, you can review the files that were matched at runtime for the specified parameters.
  • Parameters tab: View the parameter names and values that were used as part of the job, including the list of matching datasets. 

See Job Details Page.
