Use Sample to limit the data stream to a specified number, percentage, or random set of rows. You can also group by one or more columns, in which case the Sample tool applies the selected configuration within each group.
Tool Components
Figure: Sample Tool with anchors.
The Sample tool has 2 anchors.
- Input anchor: Use the input anchor to select the data you want to sample.
- Output anchor: Outputs the sampled data.
Configure the Tool
- Select a sampling method. Enter N in the text box that follows the sampling methods; the value is limited to 16 characters. The options are:
- First N Rows: Returns every row in the data from the first through row N.
- Last N Rows: Starting from the row that is N rows away from the end of the data, returns every row through to the end of the data.
- 1 of Every N Rows will be Sampled: Returns the first row of every group of N rows.
- First N% of Rows: Returns the first N percent of rows. This option requires the data to pass through the tool twice: once to count the rows and again to return the specified percentage of rows.
- 1 in N Chance to Include Each Row: Randomly determines if each row is included in the sample, independent of the inclusion of any other rows.
NOTE: The option 1 in N Chance to Include Each Row returns an approximation. For example, if you have 1,000 rows, select this option, and specify N as 10, you might expect the tool to return 100 rows. However, because each row is included or excluded independently, it could return between 75 and 150 rows.
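The following minimal Python sketch illustrates why the random option returns an approximate count. It is not the tool's implementation: the function name `one_in_n_chance` and the `seed` parameter are hypothetical, and the sketch only assumes that each row is kept independently with probability 1/N, as described above.

```python
import random

def one_in_n_chance(rows, n, seed=None):
    """Keep each row independently with probability 1/n (illustrative only)."""
    rng = random.Random(seed)  # seed is a hypothetical knob for repeatable runs
    return [row for row in rows if rng.random() < 1 / n]

rows = list(range(1000))

# Run the same sampling several times with different seeds.
# The counts cluster around 1000 / 10 = 100 but are rarely exactly 100.
counts = [len(one_in_n_chance(rows, 10, seed=s)) for s in range(5)]
print(counts)
```

Because every row is decided separately, the sample size varies from run to run even though the input and N stay the same.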
- Enter a number in N= to specify the value for N.
- Sample records based on order: Select the Column Name and Order for the columns you want to sample.
- Columns to Group By: If groups are specified, N rows are returned for each group. This option is not available for the 1 in N Chance to Include Each Row sampling method.
NOTE: If you select to group by a column named City, specify N as 2, and select First N Rows, Designer Cloud returns the first 2 rows for each City.
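As a rough illustration of the deterministic sampling methods and of grouping, the Python sketch below mimics each option on an in-memory list. All function names are hypothetical, and details such as how First N% rounds a fractional row count and the order in which grouped output is returned are assumptions, not documented behavior of the tool.

```python
from collections import defaultdict

def first_n(rows, n):
    """First N Rows: rows 1 through N."""
    return rows[:n]

def last_n(rows, n):
    """Last N Rows: the final N rows."""
    return rows[-n:] if n > 0 else []

def one_of_every_n(rows, n):
    """1 of Every N Rows: the first row of each consecutive block of N rows."""
    return rows[::n]

def first_n_percent(rows, n):
    """First N% of Rows: count the rows first, then return the leading share."""
    count = len(rows)        # first pass: count the rows
    keep = count * n // 100  # assumption: fractional counts round down
    return rows[:keep]       # second pass: return that many rows

def sample_by_group(rows, key, method, n):
    """Apply a sampling method separately to each group of rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    result = []
    for group_rows in groups.values():  # groups in order of first appearance
        result.extend(method(group_rows, n))
    return result

cities = [
    {"City": "Denver", "id": 1},
    {"City": "Boulder", "id": 2},
    {"City": "Denver", "id": 3},
    {"City": "Denver", "id": 4},
    {"City": "Boulder", "id": 5},
    {"City": "Boulder", "id": 6},
]

# Group by City, First N Rows, N = 2: the first 2 rows for each City.
grouped = sample_by_group(cities, "City", first_n, 2)
print([row["id"] for row in grouped])  # [1, 3, 2, 5]
```

In this sketch, grouping simply partitions the rows by the chosen column and runs the same method on each partition, which is why N rows come back per group rather than N rows overall.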