
...

This section provides information on improvements to the type system.

Release 8.11

Mismatched values are no longer published as null values in CSV outputs

In prior releases, when a file was published in CSV format, any values that were mismatches for a column's data type were written as null values, which could lead to loss of data that was meaningful to downstream systems. 

Beginning in this release, mismatched values are written out in CSV format as String values by default.

NOTE: The ability to write out mismatched values in CSV outputs is enabled by default in new flows and CSV publishing actions. For existing CSV outputs, the prior behavior is maintained.

NOTE: This capability applies only to CSV outputs at this time. In the future, it will be applied to other non-schematized outputs, such as JSON.

Tip: When visual profiling is enabled, you can still identify the values in the generated results that are mismatched for their column data types.

As needed, you can configure the ability to write out mismatches for CSV outputs for individual publishing actions. For more information, see File Settings.
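The change above can be sketched in a few lines. This is a hypothetical illustration only; the `write_csv` helper and its `keep_mismatches` flag are invented for the sketch and are not product APIs. It shows the difference between the prior behavior (mismatched values written as null/empty fields) and the new default (mismatched values passed through as String values):

```python
import csv

# Hypothetical sketch (not the product's implementation): a value that fails
# conversion to the column's declared data type is either emitted as a null
# (empty) field, as in prior releases, or passed through as a String, which
# is the default for new flows beginning in Release 8.11.

def write_csv(path, rows, declared_type, keep_mismatches):
    """Write (id, value) rows, handling values that fail type conversion."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row_id, value in rows:
            try:
                declared_type(value)  # value conforms to the column type
                writer.writerow([row_id, value])
            except ValueError:
                # Mismatched value: keep it as a String, or emit a null field
                writer.writerow([row_id, value if keep_mismatches else ""])

rows = [("r1", "12"), ("r2", "not-a-number")]
write_csv("old.csv", rows, int, keep_mismatches=False)  # prior: null
write_csv("new.csv", rows, int, keep_mismatches=True)   # 8.11 default: String
```

With `keep_mismatches=True`, the downstream system still receives `not-a-number` and can decide how to handle it, rather than silently losing the value.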

Release 8.7

Snowflake date publishing improvements

In prior releases, the application did not fully support Datetime values on publication. It published Date type values to Snowflake as follows:

| Publishing Action | Target Table | Date values published as |
| --- | --- | --- |
| Create/Drop Table | Date | Datetime, appends 00:00:00 |
| Create/Drop Table | Datetime | Appends 00:00:00 |
| Append/Truncate | Date | Appends 00:00:00 |

Beginning in this release, the application publishes Date values as follows:

| Publishing Action | Target Table | Date values published as |
| --- | --- | --- |
| Create/Drop | Date | Date |
| Create/Drop | Datetime | Date |
| Append/Truncate | Date | Datetime, appends 00:00:00 |

For more information, see Snowflake Data Type Conversions.
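The conversions in the tables above amount to appending a midnight time component when a Date value lands in a Datetime target, while a Date target now receives the plain date. A minimal sketch, in which `publish_date` is a hypothetical helper for illustration rather than anything in the product:

```python
from datetime import date, datetime, time

# Hypothetical illustration of the conversion described above: a Date value
# published to a Datetime target gains a midnight (00:00:00) time component;
# a Date target keeps the plain date value.

def publish_date(value: date, target_type: str) -> str:
    """Render a Date value for a Date or Datetime target column."""
    if target_type == "Datetime":
        # Appends 00:00:00, e.g. 2021-06-15 -> 2021-06-15 00:00:00
        return datetime.combine(value, time.min).strftime("%Y-%m-%d %H:%M:%S")
    return value.isoformat()  # Date target: no time component

print(publish_date(date(2021, 6, 15), "Date"))      # 2021-06-15
print(publish_date(date(2021, 6, 15), "Datetime"))  # 2021-06-15 00:00:00
```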

Release 8.2

None.

Release 8.0

Data type inference and row split inference run on more data

When a dataset is imported into the application, a larger volume of data is read from it for the following processes:

NOTE: The following applies to datasets that do not contain schema information.

  • Split row inference: Patterns in the data are used to determine the end of a row of data. When a larger volume of data is read, there should be more potential rows to review, resulting in better precision on where to split the data into separate rows in the application.
  • Type inference: Patterns of data in the same column are used to determine the best data type to assign to the imported dataset. A larger volume of data means that the application has more values for the same column from which to infer the appropriate data type.
NOTE: An increased data volume should result in more accurate split-row and data-type inference. For pre-existing datasets, this increased volume may result in changes to the row and column type definitions when a dataset is imported.


Tip: For datasets that are demarcated by quoted values, you may experience a change in how columns are typed.


If you notice unexpected changes in column data types or in row splits in your datasets:

  1. Type inference: Move the recipe panel cursor to the top of the dataset to see if you must reassign data types.
  2. Split row inference: Create a new imported dataset, disabling type inference in the import settings. Check the splitrows transform to see if it is splitting the rows appropriately. For more information, see Import Data Page.
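Sample-based type inference of the kind described above can be sketched as follows. This illustration is built entirely on assumptions: the candidate-type list, the 90% match threshold, and the `infer_type` function are invented for the sketch and are not the application's algorithm. It shows why reading a larger volume of data can settle a column on the correct type:

```python
# Hypothetical sketch of sample-based type inference: try candidate types
# against the sampled values for a column and pick the first one that a
# large-enough share of the values conforms to.

def conforms(value: str, candidate) -> bool:
    try:
        candidate(value)
        return True
    except ValueError:
        return False

def infer_type(sample: list[str]) -> str:
    candidates = [("Integer", int), ("Decimal", float)]
    for name, caster in candidates:
        matches = sum(conforms(v, caster) for v in sample)
        if matches / len(sample) >= 0.9:  # tolerate a few mismatched values
            return name
    return "String"

# A small sample can be skewed by one bad value; a larger one is not.
small = ["1", "2", "oops"]                    # 2/3 match Integer: below threshold
large = small + [str(n) for n in range(27)]   # 29/30 match Integer
print(infer_type(small))  # String
print(infer_type(large))  # Integer
```

The same intuition applies to split-row inference: more sampled bytes means more candidate row boundaries to test a delimiter pattern against.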

Release 7.5

PII - Improved matching for social security numbers

...