Find and Fix Errors
In the Designer Cloud application, errors in your data are flagged visually, and you correct them through an interactive workflow:
- Identify missing or mismatched data by color-coded bars in column data.
- Select a bar.
- Suggestions are offered in a set of cards on the right panel.
- Click a suggestion, and immediately see the effects of the suggested transformation previewed in the data grid.
- If the transformation needs tweaking, edit it.
- If the transformation is not the correct one, click another suggestion.
- When you are satisfied, add the transformation, and it is applied to your sample of data.
By seeing issues in your sampled data, selecting them, and refining the suggested fixes, you can address data mismatches, missing data, non-standard values, outlier values, and more, which improves the overall consistency and quality of your data.
For more information, see Find and Fix Bad Data.
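As a rough illustration of what the color-coded bars summarize, the following Python sketch counts valid, mismatched, and missing values in a single column. It is not how the application is implemented; the function names `profile_column` and `is_integer` and the sample values are assumptions made for this example.

```python
def profile_column(values, is_valid):
    """Count valid, mismatched, and missing values in one column."""
    counts = {"valid": 0, "mismatched": 0, "missing": 0}
    for value in values:
        # Treat None and blank strings as missing data.
        if value is None or str(value).strip() == "":
            counts["missing"] += 1
        elif is_valid(value):
            counts["valid"] += 1
        else:
            # Present, but does not fit the expected data type.
            counts["mismatched"] += 1
    return counts


def is_integer(text):
    """Hypothetical type check: does the value parse as an integer?"""
    try:
        int(str(text).strip())
        return True
    except ValueError:
        return False


# Hypothetical sample column with blanks and a non-numeric entry.
ages = ["34", "", "41", "n/a", None, "29"]
summary = profile_column(ages, is_integer)
print(summary)
```

A fix for the mismatched values could then be as simple as replacing them with a default or deleting the affected rows, which is the kind of choice the suggestion cards present.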
Add New Data
You can add new data to your dataset through the following methods.
Create new columns in your dataset containing literal values, function outputs, or values from other columns, including extraction of values into new columns.
For more information, see Create Column.
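The four kinds of new column described above can be sketched in plain Python as follows. This is only an illustration of the idea, not the application's own syntax; the column names, the sample rows, and the `add_columns` helper are hypothetical.

```python
import re

# Hypothetical sample rows.
rows = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Alan Turing", "email": "alan@example.org"},
]


def add_columns(row):
    """Return a copy of the row with four new columns added."""
    new_row = dict(row)
    new_row["source"] = "crm"                    # literal value
    new_row["name_length"] = len(row["name"])    # function output
    new_row["contact"] = row["email"]            # value from another column
    # Extraction: pull the part after "@" into its own column.
    match = re.search(r"@(.+)$", row["email"])
    new_row["domain"] = match.group(1) if match else None
    return new_row


result = [add_columns(row) for row in rows]
print(result[0])
```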
You can insert references to metadata about your datasources within your dataset. Source row and path information can be added as new data.
For more information, see Add Dataset Metadata.
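To show what adding source metadata amounts to, this minimal Python sketch attaches a row number and a file path to each record read from CSV text. The column names `source_row_number` and `source_file_path`, the path, and the choice to number data rows from 1 are assumptions for this example and may not match the application's behavior.

```python
import csv
import io


def read_with_metadata(path, text):
    """Parse CSV text and tag each row with its origin."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    # Assumption: data rows are numbered from 1, header excluded.
    for row_number, row in enumerate(reader, start=1):
        tagged = dict(row)
        tagged["source_row_number"] = row_number
        tagged["source_file_path"] = path
        rows.append(tagged)
    return rows


# Hypothetical file path and contents.
cities = read_with_metadata("data/cities.csv", "id,city\n1,Oslo\n2,Lima\n")
print(cities)
```

Keeping this information in the dataset makes it possible to trace a transformed row back to its source after joins or unions have mixed several files together.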
Join or Union Datasets
Combine data from other sources using joins or unions.
- A join combines two datasets based on values in one or more shared columns. For example, if both datasets contain a productId column, then the rows where the productId values match can be combined together.
- A union appends two or more similar datasets together. Rows of the second dataset are added to the end of the first.
In the recipe that you are developing, you can join or union your data with:
- other imported datasets
- the outputs of recipes in the same flow
- the outputs of recipes from other flows
For more information, see Combine Datasets.
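The difference between the two operations can be sketched in Python. The example below implements an inner join on a shared `productId` column and a simple union that appends rows; the application offers more options than this, and the function names and sample data are hypothetical.

```python
def inner_join(left, right, key):
    """Combine rows from two datasets where the key values match."""
    # Index the right-hand rows by key for fast lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            # If other column names collide, the left value wins.
            joined.append({**right_row, **left_row})
    return joined


def union(*datasets):
    """Append the rows of each dataset to the end of the previous one."""
    combined = []
    for dataset in datasets:
        combined.extend(dataset)
    return combined


# Hypothetical sample data.
orders = [
    {"orderId": 1, "productId": "P1"},
    {"orderId": 2, "productId": "P2"},
    {"orderId": 3, "productId": "P9"},  # no matching product
]
products = [
    {"productId": "P1", "name": "Widget"},
    {"productId": "P2", "name": "Gadget"},
]
january = [{"orderId": 1, "productId": "P1"}]
february = [{"orderId": 2, "productId": "P2"}]

joined = inner_join(orders, products, "productId")
appended = union(january, february)
print(joined)
print(appended)
```

In this sketch the order with productId P9 is dropped because an inner join keeps only matching rows. A union assumes that the datasets have similar columns.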
Reshape Your Data
You can change the composition of rows and columns in your dataset through transformation.
The following types of transformations can be used to reshape or completely replace the columns and rows in your dataset:
- Split: Split a column based on one or more known delimiters or based on index positions in the data. See Split Column Data.
- Aggregation: Perform computations across a set of grouped rows, generating the results in a new column or a reshaped table. See Create Aggregation Calculations.
- Pivot: Create pivot tables based on one or more calculations and selected fields. See Create Pivots.
- Select: Select a set of columns to completely replace the current dataset. See Select Columns.
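The four reshaping transformations listed above can be illustrated with a short Python sketch. It only demonstrates the concepts on a small hypothetical dataset; the column names, the `|` delimiter, and the use of a sum as the aggregate are assumptions for this example.

```python
from collections import defaultdict

# Hypothetical sample data: region and rep share one delimited column.
sales = [
    {"region_rep": "East|Kim", "quarter": "Q1", "amount": 100},
    {"region_rep": "East|Kim", "quarter": "Q2", "amount": 150},
    {"region_rep": "West|Lee", "quarter": "Q1", "amount": 200},
    {"region_rep": "West|Lee", "quarter": "Q1", "amount": 50},
]

# Split: break one column into two on a known delimiter.
split_rows = []
for row in sales:
    region, rep = row["region_rep"].split("|", 1)
    split_rows.append(
        {"region": region, "rep": rep,
         "quarter": row["quarter"], "amount": row["amount"]}
    )

# Aggregation: sum the amounts across rows grouped by region.
totals = defaultdict(int)
for row in split_rows:
    totals[row["region"]] += row["amount"]
totals = dict(totals)

# Pivot: regions become rows, quarters become columns, cells hold sums.
cells = defaultdict(lambda: defaultdict(int))
for row in split_rows:
    cells[row["region"]][row["quarter"]] += row["amount"]
pivot_table = {region: dict(quarters) for region, quarters in cells.items()}

# Select: keep only the chosen columns, replacing the current dataset.
selected = [{"region": r["region"], "amount": r["amount"]} for r in split_rows]

print(totals)
print(pivot_table)
```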
Nest and Unnest
Data in separate columns can be combined together into single columns as arrays or objects (maps). Similarly, columns of these object types can be expanded as new columns or new rows in your dataset. For more information, see Nested Data Basics.
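As a minimal sketch of these operations, the Python below nests two columns into a single object column, expands an object back into columns, and expands an array into one row per element. The helper names and sample rows are hypothetical and are not the application's own functions.

```python
def nest(row, columns, into):
    """Combine the named columns into one object (map) column."""
    nested = {k: v for k, v in row.items() if k not in columns}
    nested[into] = {column: row[column] for column in columns}
    return nested


def unnest_to_columns(row, column):
    """Expand an object column so each key becomes its own column."""
    flat = {k: v for k, v in row.items() if k != column}
    flat.update(row[column])
    return flat


def unnest_to_rows(row, column):
    """Expand an array column so each element gets its own row."""
    return [{**row, column: item} for item in row[column]]


# Hypothetical sample rows.
person = {"id": 1, "city": "Oslo", "country": "NO"}
nested = nest(person, ["city", "country"], into="address")
restored = unnest_to_columns(nested, "address")

tagged = {"id": 2, "tags": ["new", "priority"]}
expanded = unnest_to_rows(tagged, "tags")

print(nested)
print(expanded)
```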