Release 4.0.2
This release contains key bug fixes from Release 4.0.1.
What's New
No new features have been introduced.
Changes to System Behavior
None.
Key Bug Fixes
Ticket | Description |
---|---|
TD-25182 | Update NodeJS to 6.11.1 |
TD-25143 | Spark job gets stuck for flow with header filter and multiple map transform expressions |
TD-25090 | Spark job OOM error when failing over frequently on a Resource Manager High Availability cluster |
TD-25087 | Dictionary URL is incorrect in CDF for Spark jobs |
TD-25080 | Spark jobs with timestamp source columns yield empty columns |
TD-24965 | Job fails with "Unary operator LexiconCheck not supported" in Spark |
TD-24869 | Corrupted DotZlib.chm file in 4.0.1 RPM |
TD-24669 | Nginx Request URI length default is too low. |
TD-24464 | 'Python Error' when opening recipe with large number of columns and a nest |
TD-24409 | ArrayIndexOutOfBoundsException when UDF iterator reaches premature end |
TD-24322 | Nest transform creates a map with duplicate keys. |
TD-23921 | In shared Hadoop cluster on Edge environment, valid relational connections do not appear in the GUI. |
TD-23920 | Support for equals sign (=) in output path. |
TD-23904 | Results of Spark job show missing values, even though recipe step replaces them with a value. |
TD-23857 | Type registry fails to initialize when webapp process is relaunched. |
TD-23791 | Spark PyMultiStringReplaceUdf UDF code throws NPE when processing nested fields. |
TD-23780 | Unexpected dates appear in CSV output on Trifacta Photon job execution. |
TD-23722 | umask settings on output directories are not being respected for single-file output. |
TD-23646 | Adding a specific comment appears to invalidate earlier edit. |
TD-23645 | Spark unable to read recursive folders |
TD-23578 | Spark error doing split |
TD-23507 | No rows in random samples on CSM cluster. |
TD-23459 | Recipe upgraded from 3.1 to 3.2 becomes corrupted when new lookup is added. |
TD-23457 | Webapp, batch-job-runner scaling issues |
TD-23358 | Flow with many dependencies hangs for 6 hours and then fails when executed in Spark on AWS |
TD-23276 | Generating large CLI script blocks client access |
TD-23111 | Long latency when loading complex flow views |
TD-23102 | Recipe showing MISSING for some Lookups after upgrade |
TD-23099 | View Results button is missing on Job Cards even with profiling enabled |
TD-22907 | Spark yarn-app log dump feature requires the Alteryx account to have read/execute permissions on the log aggregation folder. |
TD-22889 | Extremely slow UI performance for some actions |
TD-22796 | Java UDFs must support the initSchema method in addition to initArgs. |
TD-22313 | Use Node.js cluster module for easy scaling of webapp and VFS services |
TD-22291 | Columns created from UDFs do not work with the column browser or column menus, and they cannot be shown or hidden. |
New Known Issues
None.
Release 4.0.1
This release adds a few new features and addresses some known issues with the platform.
What's New
Admin, Install, & Config:
NOTE: Integration with MapR is not supported for this release.
- Support for Cloudera 5.10. See Supported Deployment Scenarios for Cloudera.
- Access to S3 buckets can now be controlled on a per-user basis. See Enable S3 Access.
- More parameters now available through the application. See Admin Settings Page.
- Send Spark jobs to a specified YARN queue. See Configure for Spark.
- You can now configure the default file format for jobs run on the Hadoop cluster. See Configure for Hadoop.
- Different file formats and other options can still be configured as part of the job. See Run Job Page.
- Support for CentOS/RedHat Linux 7.1 - 7.x on Alteryx node. See System Requirements.
Language:
- Apply the optional quoteEscapeChar parameter to identify escaped quote characters when splitting rows (a sketch follows below). See Changes to the Language.
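For illustration only, a splitrows step using this parameter might look like the following sketch; the column name, delimiter, quote character, and escape character are placeholder assumptions, and the authoritative syntax is described in Changes to the Language:

splitrows col: raw_log on: '\n' quote: '\"' quoteEscapeChar: '\\'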
Changes to System Behavior
Application timeout behavior more consistent
In Release 4.0, the web application session timeout was set to 60 minutes by default, which caused inconsistent behaviors. See TD-22675 below.
In Release 4.0.1 and later, this session timeout was set to one month by default. This change returns the web application to the same setting as Release 3.2.1 and earlier.
NOTE: Beginning in Release 4.0, this setting is configurable. For more information on changing the session timeout, see Configure Application Limits.
Key Bug Fixes
Ticket | Description |
---|---|
TD-22675 | Session timeout behavior is inconsistent. Application seems to have some functionality after timeout. |
TD-22570 | After upgrade, some pre-upgrade jobs appear to point to deleted datasets. |
TD-22388 | S3 authorization mechanism does not support Signature Version 2 in Asia-Pacific and EU. |
TD-22220 | Dataset suddenly fails to load after upgrade from Release 3.2 because of type checking on an invalid recipe line. |
TD-19830 | Editing a Join or Union transform that includes a reference dataset (not in the same flow) may result in the unintentional removal of that reference dataset from the flow. |
TD-14131 | This issue is fixed with the new |
TD-5783 | Prevent two-finger scroll in data grid from stepping back in the browser's history on Mac OS. |
New Known Issues
Ticket | Component | Description |
---|---|---|
TD-22864 | Compilation/Execution | Connection for Redshift publishing uses its own AWS access key and secret, which may be different from the per-user or system credentials. If the Redshift connection does not have read access to the data, publication fails. Workaround: Verify that the access key and secret for the Redshift connection have access to any source data that you wish to publish to Redshift. |
Release 4.0
This release features a single page for managing your flows, a faster Spark-based running environment on the Alteryx node, and a number of new Wrangle functions and capabilities. Details are below.
NOTE: Integration with MapR is not supported for this release.
What's New
Workspace:
- The new flow detail page includes a visual representation of your flow and detailed information about its datasets and recipes. From the Flow View page, users can swap datasets and run jobs, too. See Flow View Page.
- Send a copy of a flow to another user. See Send a Copy of a Flow.
Transformer Page:
- Column width settings now persist across transform steps, other actions, and user sessions. See Transformer Page.
- Users can now perform joins and unions directly against imported datasets that contain schema information, such as Hive, JDBC, and Avro.
- Wrangle steps can now be displayed in natural language. See Data Grid Panel.
- New column menu shortcuts allow you to quickly assemble recipe steps from menu selections, based on a column's data type. See Column Menus.
- New column browser streamlines interactions involving multiple columns. See Column Browser Panel.
- Default quick scan samples are now collected over a larger portion of the data source (the first 1 GB). Administrators can now modify this size. See Configure Application Limits.
- For the Spark running environment, you can enable generation of random samples across the entire dataset. See Configure for Spark.
Profiling:
- Enhanced pattern profiling enables streamlined processing of fixed-width datasets. See Parse Fixed-Width File and Infer Columns.
Ingestion:
- New Custom SQL query options for Hive and relational sources enable pre-filtering of rows and columns by executing the SQL logic within the database, which reduces data transfer time for faster overall performance. See Enable Custom SQL Query.
- Users can now import Hive views to be used as a source. See Hive Browser.
- Expand the list of file extensions that are permitted for upload. See Miscellaneous Configuration.
Compilation/Execution:
- New Spark v2.1.0-based running environment leverages in-memory speed to deliver overall faster execution times on jobs. See Configure Spark Running Environment.
NOTE: As of Release 4.0, for new installs and upgrades, Spark is the default running environment for execution on the Hadoop cluster. Support for Hadoop Pig running environment is deprecated and in future releases will reach end-of-life. For more information, see Running Environment Options.
NOTE: Python UDFs are not supported in the Spark running environment. Support for Python UDFs is deprecated and in a future release will reach end-of-life. For more information on migrating to using Java UDFs, see Changes to the User-Defined Functions.
- You can disable the ability to run jobs on the Alteryx node. See Running Environment Options.
- User-specific properties can be passed to Pig or Spark for use during job execution. See Configure User-Specific Props for Cluster Jobs.
- Default file publishing setting for CSV output is now multiple output files when using a Hadoop running environment, resulting in better performance over large data volumes.
Language:
- Window transform now supports use of aggregation functions. See Window Transform.
- New NOW and TODAY functions. See NOW Function and TODAY Function.
- New ROLLINGSUM function computes the rolling sum over a specified number of rows before and after the current row (a sketch follows this list). See ROLLINGSUM Function.
- New ROLLINGAVERAGE function computes the rolling average over a specified window. See ROLLINGAVERAGE Function.
- New ROWNUMBER function computes the row number for each row, based on order and optional grouping parameters. See ROWNUMBER Function.
- New COUNTA function counts the number of non-null values in a column, based on order and grouping parameters. See COUNTA Function.
- New COUNTDISTINCT function counts the number of distinct values in a specified column. See COUNTDISTINCT Function.
- Four new functions for testing conditional data validation: IFNULL, IFMISMATCHED, IFMISSING, and IFVALID. See Type Functions.
- New *IF functions for each available aggregation function. See Aggregate Functions.
- For more information, see Changes to the Language.
APIs:
- First release of publicly available APIs, which enable end-to-end operationalization of processing your datasets. See API Reference.
CLI:
- Add custom properties, such as the YARN queue, to your jobs when executing via the CLI on the Hadoop cluster. See Configure User-Specific Props for Cluster Jobs.
Admin, Install, & Config:
- Support for HDP 2.5. See Supported Deployment Scenarios for Hortonworks.
- Support for non-default users and groups. See Required Users and Groups.
- New Admin Settings page exposes all platform configuration that is available through the application for easy search, updating, and validation. See Changes to the Admin Settings Page.
- Configurable log levels for key platform services. See Configure Logging for Services.
- Pre-upgrade samples are now persisted after upgrade is complete.
- Alteryx administrators can download services logs through the application, instead of the Alteryx node. See System Services and Logs.
Changes in System Behavior
Changes to the Language:
- set and settype transforms now work on multiple columns (a sketch follows this list).
- Recipe steps are now displayed in natural language format by default in the recipe panel and suggestion cards.
- Some functions have been renamed to conform to common function names.
- For more information, see Changes to the Language.
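For example, a single settype or set step might now act on several columns at once; the column names, type, and value below are hypothetical placeholders, so check Changes to the Language for the exact syntax:

settype col: qty,price type: 'Integer'
set col: qty,price value: 0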
Changes to the CLI:
- The Jobs command line interface now supports job execution on the Spark running environment. See CLI for Jobs.
End of Life Features:
- The Javascript running environment and profiler are no longer supported. Use the Trifacta Photon running environment instead. For more information, see Running Environment Options.
- The Hadoop Pig profiler and the Python-based Spark profiler are no longer supported. Use the Scala profiler instead. See Profiling Options.
- The /docs location for inline documentation is no longer supported. Content in that location has been replaced and superseded by content in product documentation.
- See Command Line Interface.
- See Wrangle Language.
- See Text Matching.
- For more information, see End of Life and Deprecated Features.
Key Bug Fixes
Ticket | Description |
---|---|
TD-21006 | The Trifacta Photon running environment fails to compress output file and is forced to restart on download. |
TD-20736 | Publish to Redshift fails for single-file outputs. |
TD-20524 | Join tool hangs due to mismatched data types. |
TD-20344 | When the Trifacta Photon client is enabled, no sample data is displayed when joins yield a data mismatch. |
TD-20176 | After Release 3.2.1 upgrade, data grid in the Transformer Page no longer displays any data in the sample, even though data is present in the pre-upgrade environment. |
TD-20173 | NUMFORMAT string #.#0 fails to be converted to supported string format on upgrade, and recipe step fails validation. For more information, see Changes to the Language. |
TD-19899 | Failed first job of jobgroup prevents datasets from showing up in flow. |
TD-19852 | User can accept compressed formats for append publish action. |
TD-19678 | Column browser does not recognize when you place a checkmark next to the last column in the list. |
TD-18836 | find function accepts negative values for the start index. These values are consumed but produce unexpected results. |
TD-18746 | When the Trifacta Photon client is enabled, previews in the data grid may take up to 30 seconds to dismiss. |
TD-18538 | Platform fails to start if Alteryx user for S3 access does not have the ListAllMyBuckets permission. |
TD-18340 | When writing CSV outputs, the Spark running environment fails to recognize the defined escape character. |
TD-17677 | Remove references to Zookeeper in the platform. |
TD-16419 | Comparison functions added through Builder are changed to operators in recipe |
TD-12283 | Platform cannot execute jobs on Pig that are sourced from S3, if OpenJDK is installed. |
New Known Issues
Ticket | Component | Description |
---|---|---|
TD-22128 | Compilation/Execution | Cannot read multi-file Avro stream if data is greater than 500 KB. Workaround: Load files as independent datasets and union them together, or concatenate the files outside of the platform. |
TD-21737 | Transformer Page | Cannot transform downstream datasets if an upstream dataset fails to contain a Workaround: Add a |
TD-20796 | Job Results Page | For date column, Spark profiling shows incorrect set of dates when source data has a single date in it. |
TD-19183 | Workspace | Merge function does not work with double-escaped values, and job fails in Pig. Example: set col: column4 value: merge(['ms\\',column4]) Workaround: Add a dummy character to the original transform and then remove it. Example: set col: column4 value: merge(['ms\\℗',column4]) replace col: column4 on: '℗' with: '' As another alternative, you can execute the job in the Spark running environment. |