This example illustrates how to extract the component parts of a URL using specialized functions for the URL data type.



Your dataset includes the following values for URLs:



When the above data is imported into the application, the column is recognized as a URL. All values are registered as valid, even the numeric address.

To extract the domain and subdomain values:
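Outside the application, the same split can be sketched in Python. This is a minimal illustration, not the application's own functions: the helper name `split_host` is hypothetical, and it naively assumes that the suffix is the last dot-separated label (so a multi-part suffix such as `co.uk` or a numeric address is not handled correctly).

```python
from urllib.parse import urlsplit


def split_host(url: str) -> dict:
    """Split a URL's host into subdomain, domain, and suffix (naive, one-label suffix)."""
    # urlsplit only recognizes the host when a scheme or leading '//' is present,
    # so prepend '//' to bare values such as 'www.example.com'.
    if "://" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    if len(labels) < 2:
        # Nothing to split: treat the whole host as the domain.
        return {"subdomain": "", "domain": host, "suffix": ""}
    return {
        "subdomain": ".".join(labels[:-2]),  # everything before the domain, may be empty
        "domain": labels[-2],
        "suffix": labels[-1],
    }


print(split_host("www.some.app.example.com"))
```

For a production-quality split, a lookup against the Public Suffix List is usually needed, which this sketch deliberately omits.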

You can use the following transformation to extract protocol identifiers, if present, into a new column:

To clean this up, you might want to rename the column to protocol_URL.
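As a rough equivalent outside the application, a protocol identifier can be pulled from the start of a value with a short Python regular expression. The function name `extract_protocol` is hypothetical, and returning the scheme without the trailing `://` is an assumption of this sketch.

```python
import re

# A scheme is a letter followed by letters, digits, '+', '.', or '-', then '://'.
PROTOCOL_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*)://")


def extract_protocol(url: str) -> str:
    """Return the protocol identifier (for example 'https'), or '' if none is present."""
    match = PROTOCOL_RE.match(url)
    return match.group(1) if match else ""


print(extract_protocol("https://www.example.com/about-us"))
print(repr(extract_protocol("www.example.com")))
```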

To extract the path values, you can use the following regular expression:

NOTE: Regular expressions are considered a developer-level method for pattern matching. Please use them with caution. See Text Matching.

The above transformation grabs a little too much of the URL. If you rename the column to path_URL, you can use the following regular expression to clean it up:
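For comparison, here is a minimal Python sketch that captures only the path in a single pass, so no second cleanup step is needed. It is not the application's own expression; the name `extract_path` and the pattern are assumptions made for illustration.

```python
import re

# Skip an optional scheme and the host, then capture from the first '/'
# up to (but not including) any '?' query string or '#' fragment.
PATH_RE = re.compile(r"^(?:[A-Za-z][A-Za-z0-9+.-]*://)?[^/?#]*(/[^?#]*)?")


def extract_path(url: str) -> str:
    """Return the path portion of a URL, or '' if the URL has no path."""
    match = PATH_RE.match(url)  # always matches, since every part is optional
    return match.group(1) or ""


print(extract_path("http://www.example.com/products/?q1=a"))
print(extract_path("example.com/support"))
```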

Delete the path_URL column and rename the path_URL1 column to path_URL. Then:

If you wanted to just see the values for the q1 parameter, you could add the following:
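The same two steps, collecting all query parameters and then reading a single one, can be sketched in Python with the standard library. The helper names `url_params` and `url_param` and the sample URL are hypothetical; when a parameter repeats, this sketch keeps only its first value.

```python
from urllib.parse import parse_qs, urlsplit


def url_params(url: str) -> dict:
    """Return the query parameters of a URL as a flat name-to-value dictionary."""
    query = urlsplit(url).query
    # parse_qs returns a list of values per name; keep the first one.
    return {name: values[0] for name, values in parse_qs(query).items()}


def url_param(url: str, name: str):
    """Return the value of one query parameter, or None if it is absent."""
    return url_params(url).get(name)


sample = "http://www.example.com/search?q1=broken%20record&q2=broken%20tape"
print(url_params(sample))
print(url_param(sample, "q1"))
```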


For display purposes, the results table has been broken down into separate sets of columns.

Column set 1:


Column set 2:

URL | protocol_URL | subdomain_URL | domain_URL | suffix_URL
 |  | www | example | com
 |  |  | example | com
 |  | www.app | example | com
 |  | www.some.app | example | com
 |  | some.app | example | com
 |  | some | example | com
 |  |  | example | com

Column set 3:

URL | urlParams | urlParam_q1
 | {"q1":"broken record"} | {"q1":"broken record"}
 | {"query":"khakis","app":"pants"} |
 | {"q1":"broken record", "q2":"broken tape", "q3":"broken wrist"} | {"q1":"broken record"}