Page tree

Versions Compared


  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migration of unmigrated content due to installation of a new plugin

D toc

Generates a new session identifier based on a sorted column of timestamps and a specified rolling timeframe.

The SESSION function takes three parameters:

  • reference to column containing Datetime values used to identify sessions.
  • Numeric value to identify the length of the rolling timeframe that demarcates a session.
  • Unit of measure for the length of the rolling timeframe.

Like other windowing, order and group parameters can be applied. You can use the group and order parameters to define the groups of records and the order of those records to which this transform is applied. 


NOTE: While not explicitly required, you should use the order parameter to define a column used to sort the dataset rows. This column should match the column reference in the SESSION function.

In the generated column, identifiers are assigned to each row based on the computed session to which the row data belongs. Session IDs begin at 1 and increment, within each group. 

  • For each new group, session identifiers begin with the value 1. As a result, session identifiers are unique only within the group, and the combination of group identifier and session identifier is a unique key within the dataset. 

This function works with the following transforms:

D s

D code

window value:SESSION(Timestamp, 1, hour) order:'Timestamp'

Output: Generates a new column, which contains session identifiers for groups of rows based on 1-hour segments.

D s

D code

window value:SESSION(col_ref, k_integer,unit_of_time) order: order_col [group: group_col]

ArgumentRequired?Data TypeDescription
col_refYstringName of column whose values are applied to the function
k_integerYinteger (positive)Length of a session, in combination with unit_of_time argument
unit_of_timeYstringString literal that indicates the units of time to define a session's duration

For more information on the order and group parameters, see Window Transform.

D s lang notes


Name of the Datetime column whose values are used to determine sessions. 

  • Multiple columns and wildcards are not supported.

D s

Required?Data TypeExample Value
YesString (column reference to Datetime values)myDates


Defines the length of a session as this number of units, which are defined in the unit_of_time parameter.


NOTE: The start of a new session is determined by the first record that is found outside the boundary of the previous session. It is not determined based on any interpretation of a fixed interval.

D s

Required?Data TypeExample Value
YesInteger (positive)24


Defines the length of each unit of time for purposes of defining the length of a session.

D s

Required?Data TypeExample Value
YesString literal. See below for list.hour

Supported values:  dayhour, millisecond, minutesecond

D s

Example - Assign session identifiers to timestamped events

The following dataset contains events from a web site, categorized by customer identifier. Timestamp information is stored in two separate columns, and the imported data is sorted by the Action column. Your goal is to generate session identifiers based on sessions of five minute intervals for each customer.


2/1/169:23:00 AMC001change account settings
2/1/169:23:58 AMC003complete order
2/1/169:20:00 AMC002login
2/1/169:20:22 AMC003login
2/1/169:20:41 AMC001login
2/1/169:24:52 AMC004login
2/1/1611:24:21 AMC001login
2/1/169:24:18 AMC001logout
2/1/169:24:49 AMC003logout
2/1/169:26:22 AMC002logout
2/1/169:24:10 AMC002search: bicycles
2/1/169:23:50 AMC002search: pennyfarthings
2/1/1611:56:09 PMC004search: unicycles


This data makes more sense if it was organized by timestamp of each event. However, the timestamp information is spread across two fields: Date and Time. Your first step is to consolidate this data into a single field:

D code

merge col: Date,Time with: ' ' as: 'Timestamp'

You can now drop the two source columns. After they are dropped, you might notice that the Timestamp column is still typed as String data. This typing issue is caused by the AM/PM designators, which you can remove with the following transform:

D code

replace col: Timestamp on: ` {upper}{2}{end}` with: ''

Now that you have valid Datetime data, you can create session identifiers using the following transform:

D code

window value:SESSION(Timestamp, 5, minute) order: Timestamp group: CustId

The above transform creates session identifiers from the data in the Timestamp column for five-minute intervals. Data is initially grouped by CustId and then sorted by Timestamp before the SESSION function is applied.

You can choose to rename the generated column:

D code

rename mapping:[window,'SessionId']


2/1/2016 11:24:21C001login1
2/1/2016 9:20:41C001login2
2/1/2016 9:23:00C001change account settings2
2/1/2016 9:24:18C001logout2
2/1/2016 9:20:22C003login1
2/1/2016 9:23:58C003complete order1
2/1/2016 9:24:49C003logout1
2/1/2016 9:20:00C002login1
2/1/2016 9:23:50C002search: pennyfarthings1
2/1/2016 9:24:10C002search: bicycles1
2/1/2016 9:26:22C002logout1
1/31/2016 11:56:09 C004search: unicycles1
2/1/2016 9:24:52C004login2


  • Dataset is grouped by CustId, but the order of those groupings is determined by the first timestamp for each customer. So, C003 data appears before C002
  • C001 and C004 data result in two separate sessions. 

D s also