- Use your CSV file as the source for a new dataset within the flow containing the associated dataset.
- In your wrangled dataset, make sure that the first line of data is treated as the header. If not, add a
headertransform to your recipe.
- Open the other (source) dataset in the Transformer page.
From the Tools menu in the Transform Editor, select Union. See Transformer Page.
- Create a union:
- Include all columns from both datasets.
- Configure the step to perform the union by name, instead of by position.
- See Union Page.
- Add this step to your recipe.
- You should see one row in the union recipe that contains the new data.
- Sort your
Sort your data by a key value (e.g.
Determine an appropriate grouping parameter. This step is necessary to simplify the filling process when the job runs at scale. Ideally, you should choose a grouping column that contains a relative few number of values in it (e.g.
Fill values in the data rows with metadata column values. For each metadata column, add the following command, done here for the
source_systemcolumn of metadata.
window value: FILL(source_system, 1) order: business_id group:region
Repeat the above step for each metadata column you want to insert.
Tip: Copy the first successful step with the
FILLfunction. Paste it and modify it for each successive column.
- Drop the source metadata columns.
- Rename the
windowcolumns to use a more appropriate name.
- Delete the row containing the original metadata values.