This section provides an overview on how to apply parameters to the tables that you import as datasets.

During import of database tables through relational connections, you can apply parameters to the SQL query that you use to define the imported dataset. In some scenarios, you may need to define the table to import using a variable parameter or to parameterize the time value associated with a table name. Using parameters, you can define the specific tables that you use to bring in data from a relational database.

Following are the type of parameters you can apply for relational sources:

For more information, see Overview of Parameterization.

Import Parameterized Tables

While importing data, you parameterize relational tables by creating custom SQL statements to specify the dataset. By default, when you import a table from a relational source,  generates a SELECT * statement to import the entire table. The Custom SQL enables you to customize the query to pull the data from the source system.

The following are the prerequisites and procedures for parameterizing the relational sources table:

Prerequisites:

Create a custom SQL dataset

You can create a custom SQL dataset through the Import Data page.

Steps:

  1. In the , click Library in the left nav bar.
  2. In the Library page, click Import Data.
  3. From the left side of the Import Data page, select the relational connection from which to import.
  4. Depending on the type of relational connection, you may need to select the database or schema to browse. 
  5. Locate the tables to import. Take note of the table name or names.
  6. Click Create Dataset with SQL. The Create Dataset with SQL window is displayed.

In this window, you specify the SELECT statement to retrieve the data from a table or tables that you specify. 

NOTE: When specifying a SQL statement for your database, you are constructing a direct query of the database. You must use the syntax required by the database vendor.

For more information on creating datasets with SQL, see Create Dataset with SQL.

Parameterize dataset with a variable

A variable parameter enables you to insert variable into the query statement used to define your dataset. You can replace or highlight elements of the query to add parameters.

How to use variables:

Steps:

  1. Create a custom dataset using SQL. For more information, see Create a Custom SQL Dataset above.
  2. In the Create Dataset with SQL window, enter a SELECT* statement to retrieve data from the specified table. Click Validate SQL to verify that the query is properly specified. 
  3. Now, highlight the part of the query that you wish to parameterize. Click the Variable icon. 

    Define Variable Parameter
  4. In the Variable dialog, enter the following details:

    1. Name: Enter a display name for the variable.

    2. Default value: Enter a default value for the parameter.
  5. Click Save to save the parameter.

  6. To verify that your SQL is still valid, click Validate SQL.

  7. If the SQL is valid, click Create Dataset.

Parameterize dataset with a timestamp

Timestamp parameters can be helpful when you want to filter datasets based on date and time format, time zone, or exact and relative start time. You can apply timestamp parameters based on the specific region or time zone for which the data is generated. 

Steps:

  1. Create a custom dataset using SQL. For more information, see Create a Custom SQL Dataset above.
  2. In the Create Dataset with SQL window, enter a SELECT* statement to retrieve data from the specified table. Click Validate SQL to verify that the query is properly specified. 
  3. Now, highlight the part of the query that you wish to parameterize. Click the Timestamp icon.


    Define Timestamp Parameter
  4. In the Timestamp Parameter dialog, enter the following details:

      1. Timestamp format: Specify the format for timestamp values.
        1. Example: YYYY-MM-DD_hh_mm.
        2. Values can express both date and time elements. For more information on the available tokens for formatting date and time values, see Datetime Data Type.
      2. Timestamp value: Select the value to record in the path:
        1. Exact job start date: recorded timestamp in path is the start time of the job.
        2. Relative to the job start date: recorded timestamp in path is relative to the start time of the job according to the settings that you specify here.
      3. Time zone: Click Change to change the time zone recorded in the timestamp.
        1. Example: America/Los Angeles or Asia/Calcutta.
        2. For more information on the available time zones, see Supported Time Zone Values.

  5. Click Save to save the parameter.

  6. To verify that your SQL is still valid, click Validate SQL.

  7. If the SQL is valid, click Create Dataset.

Examples

In the following examples, you can see how dataset parameters can be used to pre-filter rows or parameterize the tables to include in your dataset.

NOTE: The syntax in these examples uses PostgreSQL syntax. For more information on basic syntax requirements, see Create Dataset with SQL.

Pre-filter rows from a table

Suppose you have a set of orders in a single table: myOrders. From this table, you want to be able to import a dataset that is pre-filtered for values in the customer identifier (custId) column. The following might be a query that you use for the customerOrders dataset to retrieve the orders for custId=0001 from the myOrders table in the transactions database:

SELECT "custId","ordDate","prodId","ordQty","unitPrice" FROM "transactions"."myOrders" WHERE "custId" = "0001"


Tip: This example uses a variable parameter.


Steps:

In this case, you can do the following:

  1. In the Create Dataset with SQL window, specify the above query. Click Validate SQL to verify that it works.
  2. Now, highlight the value 0001.
  3. Select the Variable icon. 
  4. Specify your variable:
    1. Name: myCustId
    2. Default Value: 0001
  5. Click Save.
  6. Before you create the dataset, validate the SQL.

Using the parameter:

Run a weekly job on daily tables

Suppose you have a database that captures log data into separate tables for each date. Each table is named according to the following pattern:

20201101-ServerLogs
20201102-ServerLogs
20201103-ServerLogs
20201104-ServerLogs
20201105-ServerLogs

Once per week, you want to run a job to ingest and process the log entries from the preceding week. 

The following could be a query that you use to retrieve all columns from a single file:

SELECT * FROM "logs"."20201101-ServerLogs"


Tip: This example uses a timestamp parameter.

Steps:

In this case, you can do the following:

  1. In the Create Dataset with SQL window, specify the above query. Click Validate SQL to verify that it works.
  2. Now, highlight the value 20201101.
  3. Select the Timestamp icon. 
  4. Specify your variable:
    1. Timestamp Format: YYYYMMDD
    2. Timestamp Value: Relative to job start date
      1. Select minus7days.
  5. Click Save.
  6. Before you create the dataset, validate the SQL.

Using the parameter:

When the job is executed, the imported dataset includes all of the tables whose timestamp format is within 7 days of the time when the job was started.

Parameterize entire query

You can turn the entire query of your custom SQL statement into a parameter. When you create your dataset with SQL, instead of entering any SQL in the window, create a variable parameter. For example, your parameter could be like the following:

If the SQL validates, then you can create the imported dataset using only this parameter.

How to use this parameter:

Tip: This example uses a variable parameter.