By default, the platform applies its own type inference to datasets when they are imported and again when new steps are applied to the data. This section describes how you can configure where type inference is applied in the platform.
Data types are inferred by the platform when a dataset is imported and when a new transformation step is applied to the data.
Tip: You can use the Change Column Type transformation to override the data type inferred for a column. However, if a new transformation step is added, the column data type is re-inferred, which may override your specific typing. You should consider applying Change Column Type transformations as late as possible in your recipes.
For more information on how the platform applies data types to specific sources of data on import, see Type Conversions.
Optionally, you can choose to disable type inference for schematized sources. A schematized source includes column data type information as part of the object definition. The following schematized sources are supported for import into the platform:
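The interaction between manual typing and re-inference described in the tip above can be illustrated with a small sketch. The `infer_type` and `run_recipe` functions below are hypothetical and greatly simplified; they are not the platform's actual inference logic. They only show why a Change Column Type step placed early in a recipe can be overridden by a later step.

```python
import re


def infer_type(values):
    """Hypothetical inference: pick a type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "String"
    if all(re.fullmatch(r"[+-]?\d+", v) for v in non_empty):
        return "Integer"
    if all(re.fullmatch(r"[+-]?\d+(\.\d+)?", v) for v in non_empty):
        return "Decimal"
    return "String"


def run_recipe(column, steps):
    """Apply steps in order, re-inferring the type after each transformation."""
    values = list(column)
    col_type = infer_type(values)  # inference on import
    for kind, arg in steps:
        if kind == "change_type":
            col_type = arg  # manual override of the inferred type
        elif kind == "transform":
            values = [arg(v) for v in values]
            col_type = infer_type(values)  # re-inference replaces the override
    return values, col_type


zip_codes = ["02139", "10001", "94105"]
strip = lambda v: v.strip()

# Override applied early: the later transform step re-infers the column.
early = run_recipe(zip_codes, [("change_type", "String"), ("transform", strip)])
# Override applied late: nothing follows it, so it is preserved.
late = run_recipe(zip_codes, [("transform", strip), ("change_type", "String")])

print(early[1])  # Integer
print(late[1])   # String
```

In this sketch, the ZIP code column keeps its String type only when the override is the last step, which matches the recommendation to apply Change Column Type transformations as late as possible.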
All JDBC sources
|Type inference on schematized sources||Setting||Behavior|
Data types for all datasets imported from schematized sources are automatically inferred by the platform's type system.
The inferred data types may be different from those in the source. When the dataset is loaded, data types can be applied to individual columns through the application.
Users can apply overrides for:
For schematized data sources, data types are not automatically inferred by the platform.
Data type information is taken from the source schema and applied where applicable to the dataset. If there is no corresponding data type in the platform, the data is imported as String type.
Users can apply overrides for:
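As an illustration of the behavior when type inference is disabled, the following sketch maps source schema types to platform types and falls back to String when no corresponding type exists. The mapping table and the `types_from_schema` function are assumptions made for this example only; the actual mappings are described in Type Conversions.

```python
# Hypothetical mapping from source schema types to platform data types.
SOURCE_TO_PLATFORM = {
    "INTEGER": "Integer",
    "BIGINT": "Integer",
    "DOUBLE": "Decimal",
    "BOOLEAN": "Boolean",
    "TIMESTAMP": "Datetime",
    "VARCHAR": "String",
}


def types_from_schema(source_schema):
    """Take each column's type from the source schema; unknown types become String."""
    return {
        column: SOURCE_TO_PLATFORM.get(source_type.upper(), "String")
        for column, source_type in source_schema.items()
    }


# "geometry" has no entry in the hypothetical mapping, so it is imported as String.
schema = {"id": "bigint", "price": "double", "shape": "geometry"}
result = types_from_schema(schema)
print(result)
```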
To disable type inference for schematized sources at the global level, make the following configuration change.
Change the following configuration setting to
When a dataset is imported into the platform, a volume of data is read from the source, up to the parameterized limits below. These limits define the maximum size of the data read for the initial inference of row splits and of column data types.
Tip: You can raise these limits gradually if you notice issues with either data type inference or row splits. Raising these values significantly can impact load performance in the Transformer page.
|webapp.loadLimitForSplitInference||Maximum number of bytes to be read from an imported dataset for initial inference for splitting rows. Default value is |
|webapp.loadLimitForTypeInference||Maximum number of bytes to be read from an imported dataset for initial inference of column data types. Default value is |
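The effect of these byte limits can be sketched as follows. The limit values here are hypothetical placeholders chosen to keep the example small, not the defaults, and trimming a trailing partial row is an assumption of this sketch rather than documented platform behavior.

```python
import io

# Hypothetical limits in bytes, standing in for
# webapp.loadLimitForSplitInference and webapp.loadLimitForTypeInference.
LOAD_LIMIT_FOR_SPLIT_INFERENCE = 64
LOAD_LIMIT_FOR_TYPE_INFERENCE = 32


def read_sample(stream, limit):
    """Read at most `limit` bytes and drop a trailing partial row, if any."""
    chunk = stream.read(limit)
    if len(chunk) == limit and b"\n" in chunk:
        # The limit may have cut a row in half; keep only complete rows.
        chunk = chunk[: chunk.rfind(b"\n") + 1]
    return chunk


data = b"id,name\n" + b"".join(b"%d,row%d\n" % (i, i) for i in range(100))

split_sample = read_sample(io.BytesIO(data), LOAD_LIMIT_FOR_SPLIT_INFERENCE)
type_sample = read_sample(io.BytesIO(data), LOAD_LIMIT_FOR_TYPE_INFERENCE)

print(len(data), len(split_sample), len(type_sample))
```

A larger limit gives inference more rows to examine, at the cost of reading more data before the dataset is displayed.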
In the application, type inference can be applied to your imported data through the following mechanisms.
You can configure individual connections to apply or not apply type inference when the connection is created or edited.
NOTE: When Default Column Data Type Inference is disabled for an individual connection, type inference can still be applied on import of individual datasets.
For more information, see Create Connection Window.
When type inference has been disabled globally for schematized sources, you can choose to enable or disable it for individual source import.
Tip: To compare the data types imported from the schematized source with those inferred by the platform, you can import the same schematized source twice. The first instance of the source can be imported with type inference enabled, and the second can be imported with it disabled.
In the Import Data page, click Edit Settings on the data source card.
For more information, see Import Data Page.
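Once the same source has been imported twice as suggested in the tip above, the two sets of column types can be compared. The sketch below assumes you have recorded the column types of each import as a plain dictionary; the column names, type names, and the `diff_types` helper are hypothetical.

```python
def diff_types(schema_types, inferred_types):
    """Return {column: (schema_type, inferred_type)} for columns whose types differ."""
    return {
        column: (schema_types[column], inferred_types.get(column))
        for column in schema_types
        if schema_types[column] != inferred_types.get(column)
    }


# Column types recorded from the import with type inference disabled.
from_schema = {"zip": "String", "amount": "Decimal", "created": "Datetime"}
# Column types recorded from the import with type inference enabled.
from_inference = {"zip": "Integer", "amount": "Decimal", "created": "Datetime"}

differences = diff_types(from_schema, from_inference)
print(differences)
```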
Type inference is automatically enabled in the data grid. It cannot be disabled.
Tip: You can override the inferred data type by applying a Change Column Type transformation.
When a new transformation step is applied, the data type of each column is re-inferred.
When you generate results, the current data types in the data grid are applied to the generated results.
If the publishing destination is a schematized environment, the generated results are written to the target environment based on the environment type. These data type mappings cannot be modified.
For more information on output types, see Type Conversions.