Sample Tool

Use Sample to limit the data stream to a specified number, percentage, or random set of rows. If you select columns to group by, the Sample tool applies the selected configuration to each group.

Tip

This tool has a One Tool Example. Visit Access Sample Workflows to learn how to access this and many other examples directly in Designer Cloud.

Tool Components

Figure: Sample Tool with anchors.

The Sample tool has 2 anchors.

  • Input anchor: Use the input anchor to select the data you want to sample.

  • Output anchor: Outputs the sampled data.

Configure the Tool

  1. Select a sampling method. Enter the value of N in the text box that follows the sampling methods; the value is limited to 16 characters. The options are:

    • First N Rows: Returns every row in the data from the first through row N.

    • Last N Rows: Returns every row from the Nth-to-last row through the end of the data.

    • Skip 1st N Rows: Returns all rows in the data starting after row N.

    • 1 of Every N Rows: Returns the first row of every group of N rows.

    • First N% of rows: Returns N percent of rows. This option requires the data to pass through the tool twice: once to calculate the count of rows and again to return the specified percent of rows.

    • 1 in N Chance to Include Each Row: Randomly determines if each row is included in the sample, independent of the inclusion of any other rows.

      Note

      The option 1 in N Chance to Include Each Row returns an approximation. For example, if you have 1,000 rows and specify N as 10, you might expect the tool to return 100 rows. However, because each row is included independently of the others, the count varies from run to run, and the tool could return between 75 and 150 rows.

  2. Enter a number in N= to specify the value for N.

  3. Columns to Group By (Optional): If groups are specified, N rows are returned for each group. This option is not available for the 1 in N Chance to Include Each Row sampling method.

    Note

    For example, if you group by a column named City, specify N as 2, and select First N Rows, Designer Cloud returns the first 2 rows for each City.
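
The sampling methods and the grouping option described above can be sketched in plain Python as a way to reason about which rows each option returns. This is an illustrative sketch only, not how Designer Cloud implements the tool. The function names, the use of in-memory lists for rows, the round-down rule for First N% of rows, and the order in which groups are emitted are all assumptions made for the example.

```python
import random


def first_n(rows, n):
    """First N Rows: rows 1 through N."""
    return rows[:n]


def last_n(rows, n):
    """Last N Rows: the final N rows of the data."""
    return rows[-n:] if n > 0 else []


def skip_first_n(rows, n):
    """Skip 1st N Rows: every row after row N."""
    return rows[n:]


def one_of_every_n(rows, n):
    """1 of Every N Rows: the first row of each consecutive block of N rows (n >= 1)."""
    return rows[::n]


def first_n_percent(rows, n):
    """First N% of rows: count the rows, then keep the leading N percent.

    Rounding down is an assumption; the source does not state a rounding rule.
    """
    total = len(rows)                 # first pass: count the rows
    return rows[: total * n // 100]   # second pass: keep the leading share


def one_in_n_chance(rows, n, rng=random):
    """1 in N Chance: keep each row independently with probability 1/N."""
    return [row for row in rows if rng.random() < 1 / n]


def sample_by_group(rows, key, method, n):
    """Apply a sampling method to each group separately.

    Groups are emitted in first-seen order, which is an assumption.
    Per the text above, grouping does not apply to the 1 in N Chance method.
    """
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    sampled = []
    for group_rows in groups.values():
        sampled.extend(method(group_rows, n))
    return sampled


rows = list(range(1, 11))
print(one_of_every_n(rows, 3))    # [1, 4, 7, 10]
print(first_n_percent(rows, 30))  # [1, 2, 3]

# Hypothetical (City, value) rows, grouped by City with First N Rows and N = 2.
cities = [("Boulder", 1), ("Denver", 2), ("Boulder", 3), ("Denver", 4), ("Boulder", 5)]
print(sample_by_group(cities, key=lambda r: r[0], method=first_n, n=2))
# [('Boulder', 1), ('Boulder', 3), ('Denver', 2), ('Denver', 4)]

# The random method returns roughly, not exactly, 1,000 / 10 rows.
print(len(one_in_n_chance(list(range(1000)), 10, random.Random(0))))
```

Because the random method decides each row independently, rerunning the last line with a different seed gives a different count, which is the approximation the earlier note describes.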