Generates the Unicode index value for the first character of the input string.
  • Unicode is a digital standard for the consistent encoding of the world's writing systems, so that representation of character sets is consistent around the world. 
  • The first 128 Unicode characters (0 through 127) correspond to the ASCII character set, and the first 256 (0 through 255) correspond to the ISO 8859-1 (Latin-1) character set.
  • If the function cannot resolve a Unicode character from the first character, it returns a null value. 
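
As a rough illustration outside of Wrangle, Python's built-in ord() and chr() expose the same Unicode index values (code points). The sketch below is not the product's implementation; it only checks how the low code points line up with the ASCII and ISO 8859-1 (Latin-1) byte encodings. The variable names are illustrative.

```python
# ord() returns the Unicode code point (index value) of a single character.
print(ord("A"))  # 65

# Code points 0-127 coincide with ASCII byte values.
ascii_match = all(chr(i).encode("ascii")[0] == i for i in range(128))

# Code points 0-255 coincide with ISO 8859-1 (Latin-1) byte values.
latin1_match = all(chr(i).encode("latin-1")[0] == i for i in range(256))

print(ascii_match, latin1_match)  # True True
```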

Basic Usage

Column reference example:

UNICODE(MyChar)

Output: Returns the Unicode index value for the first character in the MyChar column.

String literal example:

UNICODE('A')

Output: Returns the integer 65.

Syntax

UNICODE(column_string)

Argument        Required?   Data Type   Description
column_string   Y           string      Name of the column or string literal whose first character is converted to its Unicode value

For more information on syntax standards, see Language Documentation Syntax Notes.

column_string

Name of the column or string literal, the first character of which is converted to its corresponding Unicode value.

NOTE: If the input string contains multiple characters, the first character is mapped to its Unicode value, and the rest are ignored.

  • Missing string or column values generate missing string results.
  • String constants must be quoted ('Hello, World').
  • Multiple columns and wildcards are not supported.
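
The notes above can be sketched in Python for a column of values. This is an illustration under assumptions, not Wrangle itself: the function name unicode_column is hypothetical, a Python list stands in for a column, and None stands in for a missing value.

```python
def unicode_column(values):
    """Map each string to the code point of its first character.

    Missing (None) or empty inputs yield None; characters after the
    first are ignored.
    """
    return [ord(v[0]) if v else None for v in values]


column = ["Hello, World", "apple", "", None, "Z"]
print(unicode_column(column))  # [72, 97, None, None, 90]
```

Only the first character of 'Hello, World' contributes to the result, which matches the behavior described in the note on multi-character input.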

Usage Notes:

Required?   Data Type                            Example Value
Yes         String literal or column reference   myColumn

Examples

Example - CHAR and UNICODE functions

In this example, you can see how the CHAR function can be used to convert numeric index values to Unicode characters, and the UNICODE function can be used to convert characters back to numeric values.

Source:

The following column contains some source index values:

index
1
33
33.5
34
48
57
65
90
97
122
254
255
256
257
9998
9999

Transformation:

When the above values are imported into the Transformer page, the column is typed as Integer, with a single mismatched value (33.5). To see the corresponding Unicode characters for these index values, enter the following transformation:

Transformation Name           New formula
Parameter: Formula type       Single row formula
Parameter: Formula            CHAR(index)
Parameter: New column name    'char_index'

To see how these characters map back to the index values, now add the following transformation:

Transformation Name           New formula
Parameter: Formula type       Single row formula
Parameter: Formula            UNICODE(char_index)
Parameter: New column name    'unicode_char_index'

Results:

index   char_index        unicode_char_index
1       (non-printable)   1
33      !                 33
33.5
34      "                 34
48      0                 48
57      9                 57
65      A                 65
90      Z                 90
97      a                 97
122     z                 122
254     þ                 254
255     ÿ                 255
256     Ā                 256
257     ā                 257
9998    ✎                 9998
9999    ✏                 9999

Note that the floating-point input value (33.5) was not processed.
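
The round trip in this example can be approximated in Python with chr() and ord(). This is a minimal sketch under assumptions, not the Wrangle functions themselves: char_from_index and unicode_from_char are hypothetical helper names, None stands in for an empty result, and only a subset of the source values is used.

```python
def char_from_index(value):
    """Rough analogue of CHAR: integer index -> character, else None."""
    if isinstance(value, int) and not isinstance(value, bool):
        return chr(value)
    return None  # non-integer input such as 33.5 is not processed


def unicode_from_char(text):
    """Rough analogue of UNICODE: first character -> index, else None."""
    return ord(text[0]) if text else None


index = [33, 33.5, 65, 90, 97, 255, 256, 9999]
char_index = [char_from_index(v) for v in index]
unicode_char_index = [unicode_from_char(c) for c in char_index]

# Print one row per source value: (index, char_index, unicode_char_index)
for row in zip(index, char_index, unicode_char_index):
    print(row)
```

Every integer input maps back to itself after the round trip, while the 33.5 row stays empty in both generated columns.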

 
