Find and Fix Errors
- Identify missing or mismatched data by color-coded bars in column data.
- Select a bar.
- Suggestions are offered in a set of cards on the right panel.
- Click a suggestion, and immediately see the effects of the suggested transformation previewed in the data grid.
- If the transformation needs tweaking, edit it before you add it.
- If the transformation is not the correct one, click another suggestion.
- When you are satisfied, add the transformation, and your sample of data is transformed.
Select errors in your data, and review AI-driven suggestions for how to correct them. Make the change on the spot.
Through this series of seeing, selecting, and refining issues in your sampled data, you can address basic errors in data mismatches, missing data, non-standard values, outlier values, and much more to improve the overall consistency and quality of your data.
For more information, see Find and Fix Bad Data.
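The categories behind the color-coded bars can be illustrated outside the application. The following is a minimal sketch in plain Python, not the application's own logic: the function name `profile_column`, the sample values, and the rule that a value is "mismatched" when it cannot be converted to the expected type are all assumptions made for illustration.

```python
def profile_column(values, expected_type=int):
    """Count valid, mismatched, and missing values in one column.

    Hypothetical helper: a value is 'missing' if it is None or blank,
    and 'mismatched' if it cannot be converted to expected_type.
    """
    counts = {"valid": 0, "mismatched": 0, "missing": 0}
    for value in values:
        if value is None or str(value).strip() == "":
            counts["missing"] += 1
            continue
        try:
            expected_type(value)
            counts["valid"] += 1
        except (TypeError, ValueError):
            counts["mismatched"] += 1
    return counts


# Sample column with two blanks and one non-numeric entry.
ages = ["34", "41", "", "n/a", None, "29"]
print(profile_column(ages))
# {'valid': 3, 'mismatched': 1, 'missing': 2}
```

A fix for the mismatched value could then be any transformation that replaces or removes it, which is the kind of change the suggestion cards propose.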
Add New Data
You can add new data to your dataset through the following methods.
Create new columns in your dataset containing literal values, function outputs, or values from other columns, including extraction of values into new columns.
Build a New Formula transformation to craft a new column of data containing custom functions or literal values.
For more information, see Create Column.
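As a rough illustration of the kinds of new columns described above, the sketch below builds a literal column, a computed column, and an extracted column in plain Python. The column names (`orderId`, `price`, `qty`, `currency`, `total`, `orderNumber`) and the helper `add_columns` are hypothetical and do not come from the application.

```python
import re

rows = [
    {"orderId": "A-1001", "price": 12.5, "qty": 2},
    {"orderId": "B-2002", "price": 3.0, "qty": 10},
]


def add_columns(row):
    """Return a copy of the row with three new columns added."""
    new_row = dict(row)
    # Literal value: the same constant in every row.
    new_row["currency"] = "USD"
    # Formula output: computed from values in other columns.
    new_row["total"] = row["price"] * row["qty"]
    # Extraction: pull the first run of digits out of an existing column.
    match = re.search(r"\d+", row["orderId"])
    new_row["orderNumber"] = match.group(0) if match else None
    return new_row


result = [add_columns(row) for row in rows]
print(result[0])
```

The original columns are left untouched; each new value is written to its own column, which mirrors the behavior described in this section.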
You can insert references to metadata about your data sources within your dataset. Source row and path information can be added as new data.
For more information, see Add Dataset Metadata.
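Conceptually, adding source metadata means attaching the origin of each row as extra columns. A minimal sketch follows, assuming a hypothetical file path and the hypothetical column names `source_path` and `source_row`; the application's own metadata references may be named and numbered differently.

```python
def add_source_metadata(rows, path):
    """Attach the source path and 1-based source row number to each row."""
    return [
        dict(row, source_path=path, source_row=number)
        for number, row in enumerate(rows, start=1)
    ]


records = [{"name": "alpha"}, {"name": "beta"}]
# "data/orders.csv" is a made-up path used only for this example.
tagged = add_source_metadata(records, "data/orders.csv")
print(tagged[1])
# {'name': 'beta', 'source_path': 'data/orders.csv', 'source_row': 2}
```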
Join or Union Datasets
Combine data from other sources using joins or unions.
- A join combines two datasets based on values in one or more shared columns. For example, if both datasets contain a productId column, then the rows where the productId values match can be combined together.
- A union appends two or more similar datasets together. Rows of the second dataset are added to the end of the first.
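The difference between the two operations can be sketched in plain Python. This is an illustration only: it shows an inner join, which is one of several possible join types, and the sample datasets and the function names `inner_join` and `union` are assumptions rather than anything defined by the application.

```python
orders = [
    {"productId": 1, "qty": 2},
    {"productId": 2, "qty": 5},
    {"productId": 3, "qty": 1},
]
products = [
    {"productId": 1, "name": "widget"},
    {"productId": 2, "name": "gadget"},
]
more_orders = [
    {"productId": 2, "qty": 4},
]


def inner_join(left, right, key):
    """Combine rows from both datasets where the key values match."""
    index = {}
    for right_row in right:
        index.setdefault(right_row[key], []).append(right_row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


def union(*datasets):
    """Append the rows of each dataset to the end of the previous one."""
    combined = []
    for dataset in datasets:
        combined.extend(dataset)
    return combined


joined = inner_join(orders, products, "productId")
stacked = union(orders, more_orders)
print(len(joined), len(stacked))
# 2 4
```

In this sketch the order with productId 3 has no match in `products`, so it is absent from the join result. A union does no matching at all; it assumes the datasets share the same columns and stacks their rows.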
In the recipe that you are developing, you can apply these operations to:
- other imported datasets
- the outputs of recipes in the same flow
- the outputs of recipes from other flows
For more information, see Combine Datasets.
Reshape Your Data
You can change the composition of rows and columns in your dataset through transformation.
Change the structure of your data through menu-driven selection of rows, columns, and formulas.
The following types of transformations can be used to reshape or completely replace the columns and rows in your dataset:
- Split: Split a column based on one or more known delimiters or based on index positions in the data. See Split Column Data.
- Aggregation: Perform computations across a set of grouped rows, generating the results in a new column or a reshaped table. See Create Aggregation Calculations.
- Pivot: Create pivot tables based on one or more calculations and selected fields. See Create Pivots.
- Select: Select a set of columns to completely replace the current dataset. See Select Columns.
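The four reshaping operations listed above can be approximated in a few lines of plain Python. This is a sketch under assumptions: the sample data, the `|` delimiter, the column names, and the choice of a sum as the calculation are all illustrative and not taken from the application.

```python
from collections import defaultdict

sales = [
    {"region_product": "east|widget", "amount": 10},
    {"region_product": "east|gadget", "amount": 5},
    {"region_product": "west|widget", "amount": 7},
    {"region_product": "east|widget", "amount": 3},
]

# Split: break one column into two on a known delimiter.
split_rows = []
for row in sales:
    region, product = row["region_product"].split("|")
    split_rows.append({"region": region, "product": product, "amount": row["amount"]})

# Aggregation: sum the amounts across rows grouped by region.
totals = defaultdict(int)
for row in split_rows:
    totals[row["region"]] += row["amount"]

# Pivot: one row per region, one column per product, summed amounts as cells.
pivot = defaultdict(dict)
for row in split_rows:
    cells = pivot[row["region"]]
    cells[row["product"]] = cells.get(row["product"], 0) + row["amount"]

# Select: keep only the listed columns, replacing the current dataset.
selected = [{"region": row["region"], "amount": row["amount"]} for row in split_rows]

print(dict(totals))
# {'east': 18, 'west': 7}
print(dict(pivot))
# {'east': {'widget': 13, 'gadget': 5}, 'west': {'widget': 7}}
```

Note that the aggregation and the pivot both reduce the number of rows, while the split and the select change only the columns.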
Nest and Unnest
Data in separate columns can be combined together into single columns as arrays or objects (maps). Similarly, columns of these object types can be expanded as new columns or new rows in your dataset. For more information, see Nested Data Basics.
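To make nesting and unnesting concrete, the sketch below uses Python dictionaries for objects (maps) and lists for arrays. The helper names `nest`, `unnest_to_columns`, and `unnest_to_rows`, along with the sample columns, are hypothetical and serve only to illustrate the idea.

```python
record = {"id": 1, "city": "Oslo", "zip": "0150", "tags": ["a", "b"]}


def nest(row, columns, into):
    """Combine several columns into a single object (map) column."""
    nested = {key: value for key, value in row.items() if key not in columns}
    nested[into] = {column: row[column] for column in columns}
    return nested


def unnest_to_columns(row, column):
    """Expand an object column so that each of its keys becomes a column."""
    flat = {key: value for key, value in row.items() if key != column}
    flat.update(row[column])
    return flat


def unnest_to_rows(row, column):
    """Expand an array column so that each element gets its own row."""
    return [{**row, column: item} for item in row[column]]


nested = nest(record, ["city", "zip"], "address")
print(nested)
# {'id': 1, 'tags': ['a', 'b'], 'address': {'city': 'Oslo', 'zip': '0150'}}

restored = unnest_to_columns(nested, "address")
expanded = unnest_to_rows(record, "tags")
print(len(expanded))
# 2
```

Nesting and then unnesting the same columns returns the original values, although the column order may differ.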