Page tree

Trifacta Dataprep



Contents:

If you licensed Dataprep by Trifacta before Oct. 14, 2020, you are using the Dataprep by Trifacta Legacy product edition. On October 14, 2022, this product edition will be decommissioned by Google and will be no longer available for use. Current customers of this product edition are encouraged to transition to one of the product editions hosted by Trifacta. See Product Editions.

   

Contents:


Feature Availability: This feature is available in the following editions:

  • Dataprep by Trifacta Enterprise Edition
  • Dataprep by Trifacta Professional Edition
  • Dataprep by Trifacta Premium
  • Dataprep by Trifacta Standard

This section provides general information on how the Dataprep by Trifacta uses SQL to interact with your databases, including syntax requirements and examples.

Basic Syntax

Your SQL statements must be valid for the syntax expected by the target relational system. In particular, object delimiters may vary between systems. 

NOTE: The proper syntax depends on your database system. Please consult the documentation for your product for details.


Tip: Although some relational systems do not require object delimiters around column names, it is recommended that you add them to all applicable objects.


Tip: Avoid using column type identifiers (e.g. int) and other SQL keywords as object names. Some systems may generate invalid SQL errors.


NOTE: In the following sections, Oracle syntax is used in the examples. Please modify the examples for your target system.

Supported syntax by datastore

Individual datastores may have differences in the supported syntax. For more information, please see the documentation for your datastore. 

General Examples

Here are some basic SQL examples to get started.

Column aliasing

If your select statement results in multiple columns with same name, the query fails to validate or fails on execution, such as selecting all columns in a JOIN. In these cases, columns must be properly aliased.

NOTE: This error will be caught either during validating or during dataset import.

For example, in the following JOIN, the EMPLOYEE and DEPARTMENT tables have column names department_id and department_id

SELECT * FROM EMPLOYEE INNER JOIN DEPARTMENT ON (department_id = department_id);

The above query generates an error. Columns must be properly aliased, as in the following:

SELECT e.id, e.department_id, e.first_name, e.last_name, d.department_name FROM EMPLOYEE AS E INNER JOIN DEPARTMENT d ON (e.department_id = d.department_id);

Collect whole table

SELECT * FROM "DB1"."table2";

Filter columns

SELECT lastName,firstName FROM "DB1"."table2";

Filter rows

SELECT lastName,firstName FROM "DB1"."table2" WHERE invoiceAmt > 10000;

Multi-line statement for imported datasets

Feature Availability: This feature may not be available in all product editions.

The following example uses a multi-line SQL sequence to import a dataset:

NOTE: Multi-line SQL support is considered an advanced use case. This feature must be enabled.

The following example inserts values in the TABLE_INVENTORY table and then queries the table. It utilizes Oracle syntax:

INSERT INTO "SALES"."TABLE_INVENTORY" ("ID", "AVAILABILITY") VALUES (1, 10);
SELECT * FROM "SALES"."TABLE_INVENTORY";

This page has no comments.