Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space DEV and version next

...

D s
snippetBasic

:SESSION
d-codelang-syntax
RawWrangletrue
Typeref
showNotetrue
WrangleTextwindow value
:session(Timestamp, 1, hour) order:'Timestamp'

session(Timestamp, 1, hour) order:'Timestamp'

Output:

...

Returns session identifiers for groups of rows based on 1-hour segments.

D s
snippetSyntax

SESSION
d-code-lang-syntax
RawWrangletrue
Typesyntax
showNotetrue
WrangleTextwindow value:
session(col_ref, k_integer,unit_of_time) order: order_col [group: group_col]

session(col_ref, k_integer,unit_of_time) order: order_col [group: group_col]


ArgumentRequired?Data TypeDescription
col_refYstringName of column whose values are applied to the function
k_integerYinteger (positive)Length of a session, in combination with unit_of_time argument
unit_of_timeYstringString literal that indicates the units of time to define a session's duration

...

DateTimeCustIdAction
2/1/169:23:00 AMC001change account settings
2/1/169:23:58 AMC003complete order
2/1/169:20:00 AMC002login
2/1/169:20:22 AMC003login
2/1/169:20:41 AMC001login
2/1/169:24:52 AMC004login
2/1/1611:24:21 AMC001login
2/1/169:24:18 AMC001logout
2/1/169:24:49 AMC003logout
2/1/169:26:22 AMC002logout
2/1/169:24:10 AMC002search: bicycles
2/1/169:23:50 AMC002search: pennyfarthings
2/1/1611:56:09 PMC004search: unicycles

TransformTransformation:

This data makes more sense if it was organized by timestamp of each event. However, the timestamp information is spread across two fields: Date and Time. Your first step is to consolidate this data into a single field:

d-

...

trans
RawWrangletrue
p03Value'Timestamp'
Typestep
WrangleTextmerge col: Date,Time with: ' ' as: 'Timestamp'
p01NameColumns
p01ValueDate,Time
p02NameSeparator
p02Value''
p03NameNew column name
SearchTermMerge columns

You can now delete the two source columns. After they are deleted, you might notice that the Timestamp column is still typed as String data. This typing issue is caused by the AM/PM designators, which you can remove with the following transformtransformation:

d-

...

trans
RawWrangletrue
p03Value''
Typestep
WrangleTextreplace col: Timestamp on: ` {upper}{2}{end}` with: ''
p01NameColumn
p01ValueTimestamp
p02NameFind
p02Value` {upper}{2}{end}`
p03NameReplace with
SearchTermReplace text or pattern

Now that you have valid Datetime data, you can create session identifiers using the following transformtransformation:

d-

...

trans
RawWrangletrue
p03ValueTimestamp
Typestep
WrangleTextwindow value:

...

session(Timestamp, 5, minute) order: Timestamp group: CustId
p01NameFormulas
p01Valuesession(Timestamp, 5, minute)
p02NameGroup by
p02ValueCustId
p03NameOrder by
SearchTermWindow

The above transform creates session identifiers from the data in the Timestamp column for five-minute intervals. Data is initially grouped by CustId and then sorted by Timestamp before the SESSION function is applied.

You can choose to rename the generated column:

d-

...

trans
RawWrangletrue
p03Value'SessionId'
Typestep
WrangleTextrename mapping:[window,'SessionId']
p01NameOption
p01ValueManual rename
p02NameColumn
p02Valuewindow
p03NameNew column name
SearchTermRename columns

Results:

TimestampCustIdActionSessionId
2/1/2016 11:24:21C001login1
2/1/2016 9:20:41C001login2
2/1/2016 9:23:00C001change account settings2
2/1/2016 9:24:18C001logout2
2/1/2016 9:20:22C003login1
2/1/2016 9:23:58C003complete order1
2/1/2016 9:24:49C003logout1
2/1/2016 9:20:00C002login1
2/1/2016 9:23:50C002search: pennyfarthings1
2/1/2016 9:24:10C002search: bicycles1
2/1/2016 9:26:22C002logout1
1/31/2016 11:56:09 C004search: unicycles1
2/1/2016 9:24:52C004login2

...