D toc |
---|
...
As needed, you can insert custom SQL statements as part of the data import process. These custom SQL statements allow you to pre-filter the rows and columns of relational source data within the database, where performance is faster. This query method can also be used for wider operations on relational sources from
...
within
D s product | ||
---|---|---|
|
Limitations
General
Warning |
---|
All queries are blindly executed. It is your responsibility to ensure that they are appropriate. Queries like |
...
Info |
---|
NOTE: Column names in custom SQL statements are case-sensitive. Case mismatches between SQL statement and your datasource can cause jobs to fail. |
SQL statements are stored as part of the query instance for the object. If the same query is being made across multiple users using private connections, the SQL must be shared and entered by individual users.
Each statement must be terminated with a semi-colon (Info NOTE: If a dataset created from custom SQL is shared, collaborators are not permitted to edit the custom SQL.
;
) and a newline:Code Block SELECT * FROM myDB .
myTable;SQL statements must be valid for the syntax of the target relational system. . Syntax examples are provided below.
Info NOTE: Standard SQL syntax is supported. Legacy SQL syntax is not supported.
- If you modify the custom SQL statement when reading from a source, all samples generated based on the previous SQL are invalidated.
Declared variables are not supported.
- Common Table Expressions (CTEs) are not supported.
- For each SQL statement, all columns must have an explicit name. Example:
Function references such as:
Code Block UPPER(col)
Must be specified as:
Code Block UPPER(col) as col_name
...
All single-statement SQL queries must begin with a
SELECT
statement keyword.Selecting columns with the same name, even with
"*"
, is not supported and generates an ambiguous column name error.Tip Tip: You should use fully qualified column names or proper aliasing. See Column Aliasing below.
Users are encouraged to provide fully qualified path to table being used. Example:
Code Block SELECT "id", "value" FROM "public"."my_table";
- You should use proper escaping in SQL.
...
- In the Library page, click Import Data.
- In the Import Data page, select a connection.
- Within your source, locate the table from which you wish to import. Do not select the table.
Click the Preview icon to review the columns in the dataset.
Tip Tip: You may wish to copy the database, table name, and column names to a text editor to facilitate generating your SQL statement.
Click Create Dataset with SQL. Enter or paste your SQL statement.
Warning Through the custom SQL interface, it is possible to enter SQL statements that can delete data, change table schemas, or otherwise corrupt the targeted database. Please use this feature with caution.
Info NOTE: If this button is disabled and you have enabled the custom SQL feature, the connection that you are using may lack credentials. Please review the connection definition.
D caption Create Dataset with SQL dialog
See 160412721 below.
To test the SQL, click Validate SQL. For details, see below.
To apply the SQL to the import process, click Create Dataset.
The customized source is added to the right panel. To re-edit, click Custom SQL.
Complete the other steps to define your imported dataset.
When the data is imported, it is altered or filtered based on your SQL statement.
- After dataset creation, you can modify the SQL, if needed. See Dataset Details Page.
Create with Variables
If parameterization has been enabled, you can specify variables as part of your SQL statement. Suppose you had table names like the following:
...
You can insert an inline variable as part of your custom SQL to capture all of these variations.
D caption |
---|
Insert variables in your custom SQL |
In the above, custom SQL has been added to match the first example table. When the value is highlighted and the icon is clicked, the highlighted value is specified as the default value.
...
.
...
Provide a name for the variable, and click Save.
Through the Run Job page, you can specify overrides for the default value, so the same job definition can be used across all matching tables without much modification. . For more information, see Run Job Page.
For more information on this feature, see Overview of Parameterization.
Create with timestamp parameter
...
Code Block |
---|
SELECT * FROM <YYYY-MM-DD> ; |
If the job executes on May 28th, 2019, then this parameter resolves as 2019-05-28
and gathers data from that table.
D caption |
---|
Insert timestamp parameter |
...
- All
SELECT
statements are planned, which includes syntactical validation. However, these statements are not executed. Validation should be a matter of a few seconds.
Examples
Here are some basic SQL
...
For more information on SQL syntax and supported variations, see Supported SQL Syntax. D s also
Basic Syntax
Your SQL statements must be valid for the syntax expected by the target relational system. In particular, object delimiters may vary between systems.
Info |
---|
NOTE: The proper syntax depends on your database system. Please consult the documentation for your product for details. |
Tip |
---|
Tip: Although some relational systems do not require object delimiters around column names, it is recommended that you add them to all applicable objects. |
Tip |
---|
Tip: Avoid using column type identifiers (e.g. |
Info |
---|
NOTE: In the following sections, Oracle syntax is used in the examples. Please modify the examples for your target system. |
Oracle syntax
Object delimiter: double-quote
Example syntax:
Double quotes required around database and table names and not required around column names.
Code Block |
---|
SELECT "column1","column2" FROM "databaseName"."tableName" |
SQL Server syntax
Object delimiter: none
Example syntax:
Code Block |
---|
SELECT "column1","column2" FROM "databaseName"."tableName" |
PostgreSQL syntax
Object delimiter: double-quote
Example syntax:
Double quotes required around database, table names, and column names.
Code Block |
---|
SELECT "column1","column2" FROM "databaseName"."tableName" |
BigQuery syntax
Object delimiters:
- Back-ticks around datasets (database/table combination)
- No quotes around column/field references
- Double-quotes can be used around mock data in a SELECT statement
Example syntax:
Code Block |
---|
SELECT column1,column2 FROM `databaseName.tableName` |
For more information, see https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax.
Column Aliasing
If your select statement results in multiple columns with same name, the query fails to validate or fails on execution, such as selecting all columns in a JOIN. In these cases, columns must be properly aliased.
Info |
---|
NOTE: This error will be caught either during validating or during dataset import. |
For example, in the following JOIN, the EMPLOYEE
and DEPARTMENT
tables have column names department_id
and department_id
.
Code Block |
---|
SELECT * FROM EMPLOYEE INNER JOIN DEPARTMENT ON (department_id = department_id) |
The above query generates an error. Columns must be properly aliased, as in the following:
Code Block |
---|
SELECT e.id, e.department_id, e.first_name, e.last_name, d.department_name FROM EMPLOYEE AS E INNER JOIN DEPARTMENT d ON (e.department_id = d.department_id) |
Collect Whole Table
Code Block |
---|
SELECT * FROM "DB1"."table2" |
Filter Columns
Code Block |
---|
SELECT lastName,firstName FROM "DB1"."table2 |
Filter Rows
Code Block |
---|
SELECT lastName,firstName FROM "DB1"."table2" WHERE invoiceAmt > 10000 |