The Trifacta® platform supports a single global file encoding type, which is set to
UTF-8 by default. This file encoding type applies to all text files for the following operations:
- Loading the default sample and any subsequent random samples
- Running text-based jobs on the Trifacta Server or in Hadoop
NOTE: This setting applies only to text files. Binary types, such as Avro, are not affected by the global file encoding type.
NOTE: If you change this setting, datasets that were imported under the former encoding type are no longer valid. Instructions are provided below for updating them.
Configure Global File Encoding Type
Set the following parameter to the appropriate file encoding type:
- Save your changes and restart the platform.
NOTE: After you change the global encoding type, datasets that were imported under the old encoding type must be reloaded to the platform. For more information, see Update Sources.
Supported Global File Encoding Types for Input
After you have changed the global file encoding type, restart services. See Start and Stop the Platform.
To verify the change, try to create a dataset from source data in the selected encoding type.
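Before importing, it can help to confirm outside the platform that a source file really is in the encoding you selected. The following is a minimal sketch in Python, not part of the Trifacta platform. The function name `decodes_cleanly` and the file path in the usage note are hypothetical. Note that single-byte encodings such as ISO-8859-1 accept any byte sequence, so a successful result is only informative for stricter encodings such as UTF-8.

```python
import codecs


def decodes_cleanly(path, encoding, chunk_size=1 << 16):
    """Return True if every byte of the file decodes under `encoding`.

    Hypothetical helper for a pre-import check; it does not call the platform.
    """
    # An incremental decoder handles multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    try:
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                decoder.decode(chunk)
            # Flush the decoder so a truncated trailing sequence is reported.
            decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True
```

For example, `decodes_cleanly("customers.csv", "utf-8")` returns `False` if the file contains bytes that are not valid UTF-8, which suggests the file was written in a different encoding.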
Update Sources
After you have changed the global encoding type, datasets that were imported under the former encoding type are no longer valid. To update them:
- For each dataset imported under the old encoding type, upload a new version.
- For each recipe that used the old version of the imported dataset:
  - Edit the recipe in the Transformer Page.
  - Swap the source from the old version to the new one. For more information, see Flow View Page.
- Repeat for each imported dataset and recipe combination.
Supported Global File Encoding Types for Output
Output files are written in UTF-8 encoding.
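Because output is written in UTF-8 whatever the global input encoding is, downstream consumers should read job results as UTF-8 explicitly and not rely on the operating system default. The following is a minimal sketch in Python, assuming the job produced a CSV file. The function name `read_output_rows` and the file name in the usage note are hypothetical.

```python
import csv


def read_output_rows(path):
    """Read a CSV output file as UTF-8 and return its rows as lists of strings.

    Assumes CSV output; the encoding is passed explicitly so the result does
    not depend on the locale of the machine reading the file.
    """
    # newline="" lets the csv module handle line endings inside quoted fields.
    with open(path, encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle))
```

For example, `read_output_rows("job_output.csv")` returns the header row followed by the data rows, with non-ASCII characters intact.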