When publishing to BigQuery, complete the following steps to configure the table and the settings that apply to the publish action.
- Select location: Navigate the BigQuery browser to select the database and table to which to publish.
- To create a new table, click Create a new table.
- Select table options:
NOTE: BigQuery does not support destinations with a dot (.) in the name. For more information, see https://cloud.google.com/bigquery/docs/tables#table_naming.
- New table: Enter a name for it. You may use a pre-existing table name, and schema checks are performed against it.
- Existing table: You cannot modify the name.
- Allow strict rules for type mismatch:
Feature Availability: This feature may not be available in all product editions. For more information on available features, see Compare Editions.
When enabled, the data types and their formatting set in the Dataprep by Trifacta application must match the types and formatting supported in BigQuery.
NOTE: By default, data is published to BigQuery using strict data type matching rules. When there are mismatches between Dataprep by Trifacta data types and BigQuery data types, the publication fails.
- When disabled, Dataprep by Trifacta data types are written to the closest approximation of type and formatting in BigQuery.
- For more information, see BigQuery Data Type Conversions.
- Output database: To change the database to which you are publishing, click the BigQuery icon in the sidebar. Select a different database.
- Publish actions: Select one of the following.
- Create new table every run: Each run generates a new table with a timestamp appended to the name.
- Append to this table every run: Each run adds any new results to the end of the table.
- Truncate the table every run: With each run, all data in the table is truncated and replaced with any new results.
- Drop the table every run: With each run, the table is dropped (deleted), and all data is deleted. A new table with the same name is created, and any new results are added to it.
- Merge the table every run: This publishing option merges the rows in your results with any existing rows in the target BigQuery table. For more information, see Merge Table Operations below.
More options: With each run, you can allow the following rules for strict type matching:
Datetime: Select to publish datetime for strict matching.
Array: Select to publish Dataprep by Trifacta Array types as BigQuery arrays.
NOTE: You can publish to BigQuery arrays only when the array data contains Dataprep by Trifacta primitive data types: String, Integer, Decimal, or Boolean.
For more information, see BigQuery Data Type Conversions.
- To save the publishing action, click Add or Update.
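The four non-merge publish actions differ only in what happens to the existing table before the new results are written. The following is a minimal sketch of those differences, using an in-memory dictionary as a stand-in for a BigQuery database. The `publish` function, the action names, and the timestamp format are hypothetical and chosen for illustration; they are not the product's implementation or API. Creating the table when it does not yet exist is also an assumption.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional

Row = Dict[str, object]
Dataset = Dict[str, List[Row]]  # hypothetical stand-in: table name -> rows


def publish(dataset: Dataset, table: str, results: List[Row],
            action: str, now: Optional[datetime] = None) -> str:
    """Apply one publish action and return the name of the table written."""
    # Destinations with a dot (.) in the name are not supported.
    if "." in table:
        raise ValueError("destination names with a dot (.) are not supported")

    if action == "create":
        # New table per run, timestamp appended (format is an assumption).
        stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%d_%H%M%S")
        name = f"{table}_{stamp}"
        dataset[name] = list(results)
        return name
    if action == "append":
        # Existing rows are kept; new results go at the end.
        dataset.setdefault(table, []).extend(results)
        return table
    if action == "truncate":
        # The table itself survives; only its rows are replaced.
        rows = dataset.setdefault(table, [])
        rows.clear()
        rows.extend(results)
        return table
    if action == "drop":
        # The table is deleted, then recreated under the same name.
        dataset.pop(table, None)
        dataset[table] = list(results)
        return table
    raise ValueError(f"unknown publish action: {action!r}")


dataset = {"sales": [{"id": 1}]}
publish(dataset, "sales", [{"id": 2}], "append")
print(dataset["sales"])  # [{'id': 1}, {'id': 2}]
```

In this sketch, truncating and dropping leave the same rows behind. The difference is that truncation keeps the original table object, while dropping replaces it with a new one.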
Merge Table Operations
The Merge the table every run publishing option updates existing rows in the target table with the corresponding values from your results. Optionally, it can also insert unmatched rows from your results into the table or delete rows from the table.
- In the Table Settings panel, select Merge the table every run.
Primary keys for matching: Select one or more columns whose values determine if a row in your source results matches a row in the target. When these key values match, the following columns are updated.
NOTE: Columns of Array or Object data type cannot be used as key columns for merge operations.
- If multiple rows in the target table have the same values in the matching columns, all of those rows are updated.
- If multiple rows in the source have the same values in the matching columns, the job fails.
Action on target table for matched rows: Select the action to apply to the target record when a match is found between the key columns:
- Update: The values from your results are updated into the columns specified below.
- Delete: The row in the target table is deleted.
Keys to be updated: Select one or more columns whose values are updated from your source results when the values in the primary key columns match. These are the columns that are merged into the table.
Tip: If All Columns is selected, all columns other than the matching columns are updated on a match. All columns continue to be updated even if the schema changes, and the matching columns remain in the schema.
- Insert source rows if no match in target:
- When selected, rows in your source that do not have a matching set of values in key columns are inserted into the table as new rows.
- When deselected, these unmatched rows are not written to the target table.
- Delete target rows if no match in source:
- When selected, all rows in the target that do not have a matching set of key fields in your source results are deleted.
- When deselected, unmatched rows in the target are not deleted.
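The merge settings above can be read as a small set of rules applied row by row. The following is a minimal sketch of those rules over in-memory lists of rows. The `merge_rows` function and its parameter names are hypothetical and only mirror the settings described above; this is not the product's implementation. Placing inserted rows at the end of the result is an assumption.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def merge_rows(target: List[Row], source: List[Row],
               key_columns: List[str], update_columns: List[str],
               matched_action: str = "update",
               insert_unmatched_source: bool = False,
               delete_unmatched_target: bool = False) -> List[Row]:
    """Return the target rows after merging the source rows into them."""
    if matched_action not in ("update", "delete"):
        raise ValueError(f"unknown matched action: {matched_action!r}")

    def key_of(row: Row) -> Tuple[object, ...]:
        return tuple(row[c] for c in key_columns)

    # Duplicate key values in the source make the job fail.
    source_by_key: Dict[Tuple[object, ...], Row] = {}
    for row in source:
        k = key_of(row)
        if k in source_by_key:
            raise ValueError(f"duplicate key values in source: {k}")
        source_by_key[k] = row

    result: List[Row] = []
    target_keys = set()
    for row in target:
        k = key_of(row)
        target_keys.add(k)
        match = source_by_key.get(k)
        if match is None:
            # Unmatched target row: keep it unless deletion is selected.
            if not delete_unmatched_target:
                result.append(dict(row))
        elif matched_action == "update":
            # Every target row sharing the key is updated, duplicates included.
            updated = dict(row)
            for c in update_columns:
                updated[c] = match[c]
            result.append(updated)
        # matched_action == "delete": the matched target row is dropped.

    if insert_unmatched_source:
        # Source rows with no match in the target become new rows.
        for k, row in source_by_key.items():
            if k not in target_keys:
                result.append(dict(row))
    return result


target = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
source = [{"id": 1, "qty": 10}, {"id": 3, "qty": 1}]
print(merge_rows(target, source, ["id"], ["qty"], insert_unmatched_source=True))
# [{'id': 1, 'qty': 10}, {'id': 2, 'qty': 7}, {'id': 3, 'qty': 1}]
```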