Release 9.2


This section describes settings that you can use to apply minimum and maximum limits to the Designer Cloud® application.

Operating System Limits

Raise ulimit setting

To perform normal operations, the Designer Cloud powered by Trifacta platform may need to maintain a high number of simultaneously open files, the count of which may exceed the default setting for the operating system (the ulimit).

NOTE: If the Designer Cloud powered by Trifacta platform hits the ulimit and is unable to open additional files, jobs may fail, or the platform may be unable to access content. The log may contain something similar to the following error: Failed on local exception: java.net.SocketException: Too many open files.


By default, the operating system sets the limit on the number of open files at 1024. Please complete the following steps to raise this limit. 

Tip: Depending on the capabilities of your hardware, you can raise the ulimit as high as 64000. The steps below use a value of 16000.

 

Steps:

  1. If it is running, stop the Designer Cloud powered by Trifacta platform. See Start and Stop the Platform.
  2. Verify the current ulimit:

    ulimit -Hn
  3. Edit the following file: /etc/security/limits.conf.
  4. At the bottom of the file, add the following entry, which overrides the defined limit with a value of 16000:

    *   hard    nofile  16000
  5. If you encounter the following error, add the line below after the previous entry: "java.lang.OutOfMemoryError: unable to create new native thread". This exception means that the ulimit for processes must also be increased:

    *   hard    nproc   16000
  6. Save the file and restart the platform. See Start and Stop the Platform.
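
The limits can also be inspected programmatically after the change. The following is a minimal Python sketch, assuming a Linux or other Unix host where the standard `resource` module is available. The `describe_limits` helper name is hypothetical, and the values reported are those of the process that runs the script, which may differ from the limits of the account that runs the platform.

```python
import resource

def describe_limits():
    """Return the (soft, hard) limits for open files and processes."""
    return {
        "nofile": resource.getrlimit(resource.RLIMIT_NOFILE),
        "nproc": resource.getrlimit(resource.RLIMIT_NPROC),
    }

for name, (soft, hard) in describe_limits().items():
    # A value equal to resource.RLIM_INFINITY means "unlimited".
    print(f"{name}: soft={soft} hard={hard}")
```

Run the script as the same user that starts the platform so that the reported hard limits reflect the entries in /etc/security/limits.conf.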

Browser Limits

Change body limits

If you encounter log messages indicating that a request submitted from the client is too large, you can raise the limit on the size of body objects submitted from the client.

NOTE: Raising these values too high can overload the browser.

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

Setting | Description
"webapp.bodyParser.urlEncoded.limit": "10mb", | Maximum permitted size of the URL-encoded body of a request submitted from the client. Size is in MB.
"webapp.bodyParser.json.limit": "10mb", | Maximum permitted size of a JSON object submitted from the client. Size is in MB.

Change maximum number of rows displayed in browser per join key

For each matching join key value, the Designer Cloud application displays a maximum of three rows in the browser for the current sample. As a result, when you join a dataset with repeating key values, you may see fewer rows of data than you expect.

NOTE: This limit applies only to the sampled data that is displayed in the browser. When you run a job across the entire dataset, the full number of rows is generated in the output.

For some users, this simplification may be confusing. As needed, you can use the following steps to change the maximum number of rows displayed in the browser for each join key.

Steps:

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  1. Search for and modify the following parameter:

    "webapp.client.sampleOutputTuplesPerJoinKey": 3,
  2. Save your changes and restart the platform.
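
The effect of this cap can be illustrated with a small simulation. The sketch below is an illustration only, not the application's join implementation: the `join_preview` function and its arguments are hypothetical, and it assumes an inner join over lists of dictionaries.

```python
from collections import defaultdict

def join_preview(left, right, key, max_rows_per_key=3):
    """Inner-join two lists of dicts, keeping at most max_rows_per_key
    output rows for each join key value."""
    right_by_key = defaultdict(list)
    for row in right:
        right_by_key[row[key]].append(row)

    emitted = defaultdict(int)  # output rows produced so far, per key value
    output = []
    for lrow in left:
        for rrow in right_by_key.get(lrow[key], []):
            if emitted[lrow[key]] >= max_rows_per_key:
                break
            output.append({**lrow, **rrow})
            emitted[lrow[key]] += 1
    return output

left = [{"id": 1, "l": n} for n in range(2)]
right = [{"id": 1, "r": n} for n in range(4)]
print(len(join_preview(left, right, "id")))         # 3 rows: capped preview
print(len(join_preview(left, right, "id", 10**6)))  # 8 rows: effectively uncapped
```

With two matching rows on the left and four on the right, the full join has eight rows for the key, while the capped preview shows only three.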

Change page preview limit

In the Flow and Dataset pages, you can preview the data in datasets that you have imported or are importing. For example, when you click the Eye icon next to a dataset's name, you can see a preview of the data in the dataset, which is useful for ensuring that you have the correct data. 

Depending on the size of the datasets, you may wish to increase the limit on the size of preview data. If you are working with wide datasets, you may need to increase the limit so that you can get a solid preview of the contents. 

NOTE: Increasing this preview size may affect performance, particularly on less powerful desktops. Make adjustments with caution.

Steps:

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  1. Locate the following setting, which defines the number of bytes that are loaded by default in a preview. Maximum permitted value is 1024000 (1 MB).

    "webapp.client.previewLoadLimit": 128000,
  2. Save your changes and restart the platform.
  3. After the platform has restarted, you should preview a large dataset to verify that performance is acceptable.
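
If you edit trifacta-conf.json directly instead of using the Admin Settings Page, a script can help apply a dotted setting name consistently. The following is a rough Python sketch that assumes dotted names such as webapp.client.previewLoadLimit correspond to nested JSON objects in the file; that layout is an assumption, and the `set_dotted` helper is hypothetical. The sketch works on an in-memory sample rather than on a real configuration file.

```python
import json

MAX_PREVIEW_LOAD_LIMIT = 1024000  # maximum permitted value stated above

def set_dotted(config, dotted_key, value):
    """Set config["a"]["b"]["c"] = value for the dotted_key "a.b.c"."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value
    return config

# In-memory stand-in for the relevant portion of the configuration.
config = json.loads('{"webapp": {"client": {"previewLoadLimit": 128000}}}')

new_limit = 256000
if new_limit > MAX_PREVIEW_LOAD_LIMIT:
    raise ValueError("previewLoadLimit exceeds the maximum permitted value")
set_dotted(config, "webapp.client.previewLoadLimit", new_limit)
print(json.dumps(config, indent=2))
```

Back up the configuration file before applying any scripted change, and restart the platform afterward as described in the steps above.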

Ingestion

Maximum record length

By default, the maximum length for an individual record is 20 MB. After rows have been split, individual records can be up to this limit in length. 

As needed, you can modify this limit.

Steps:

  1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
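  2. Locate the following parameter, which reflects the maximum record length in bytes. Note that the value shown below, 209715200 bytes, corresponds to 200 MB rather than 20 MB (20971520 bytes), so verify the value in your own configuration: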

    "webapp.maxRecordLength": 209715200,
  3. Modify as needed.

    NOTE: Raising this value can cause out-of-memory conditions, empty data grids, and browser crashes. Raise the value incrementally.

  4. Save your changes and restart the platform.
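
Several settings on this page are expressed in bytes, so a quick conversion can help when you review or choose a value. The following Python sketch assumes 1 MB = 1,048,576 bytes, which is consistent with values given elsewhere on this page (41943040 for 40 MB, 1073741824 for 1 GB). The helper names are hypothetical.

```python
BYTES_PER_MB = 1024 * 1024

def bytes_to_mb(n_bytes):
    """Convert a byte count to megabytes."""
    return n_bytes / BYTES_PER_MB

def mb_to_bytes(n_mb):
    """Convert megabytes to a whole number of bytes."""
    return int(n_mb * BYTES_PER_MB)

print(bytes_to_mb(209715200))  # value shown for webapp.maxRecordLength
print(mb_to_bytes(20))         # 20 MB expressed in bytes
```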

Timeouts

Change application timeout limits

The front-end application respects the following timeout settings for queries issued to back-end datastores, including the Trifacta database.

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

Setting | Description
webapp.timeoutMilliseconds | Overall timeout limit in milliseconds for the front-end application. Default value is 120000 (2 minutes).
jsdata.remoteTransformTimeoutMilliseconds | Timeout limit in milliseconds for the Transformer Page. This setting is an override of the previous one. Default value is 180000 (3 minutes).

You can change the timeout settings if you are experiencing timeouts or other errors because of long-running queries to external data connections.

NOTE: In most environments, these settings should not be changed. Lowering them can cause reasonable queries to fail, and raising them too high can cause performance issues. Please adjust them only if you are experiencing very long query times to external sources, especially for database views.

Steps:

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  1. Locate the following configuration. Specify new timeout values in milliseconds:

    "webapp.timeoutMilliseconds": 120000,
    "jsdata.remoteTransformTimeoutMilliseconds": 180000,
  2. Save your changes and restart the platform.
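
Because these values are entered in milliseconds, it is easy to be off by a factor of ten or a thousand. The short Python sketch below renders millisecond values in hours, minutes, and seconds so that you can check a value before saving it. The `describe_timeout` helper is hypothetical.

```python
from datetime import timedelta

TIMEOUTS_MS = {
    "webapp.timeoutMilliseconds": 120000,
    "jsdata.remoteTransformTimeoutMilliseconds": 180000,
}

def describe_timeout(ms):
    """Render a millisecond value as H:MM:SS for easier review."""
    return str(timedelta(milliseconds=ms))

for name, ms in TIMEOUTS_MS.items():
    print(f"{name} = {ms} ms ({describe_timeout(ms)})")
```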

Session timeout

By default, the maximum session duration is 10080 minutes (one week), as listed in the settings table below. If needed, you can change the maximum session duration, as well as other session parameter values.

Steps:

  1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  2. Modify the following parameters, as needed:

    Parameter | Description | Default
    webapp.session.refreshEmbeddedExpiryDateAfterMinutes | Refresh interval in minutes for the expiration date embedded in the session cookie | 5
    webapp.session.cookieSecureFlag | Set a secure cookie in the client application. | false

  3. Apply the following change through the Workspace Settings Page. For more information, see Platform Configuration Methods.

    Setting | Description | Default
    Session duration | Maximum session duration in minutes | 10080 (one week)
  4. Save your changes and restart the platform.

Timeout for suggestion card suggestions

By default, the platform waits a specified length of time for the machine learning service to return suggestion cards. When more time is allowed, the service may be able to discover better suggestions based on the currently selected data.

If needed, you can change the delay limit from its default value of 80 milliseconds.

Steps:

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  1. Locate the following setting and change its value:

    "feature.mlTransformSuggestions.delayThreshold": 80,
  2. Save your changes and restart the platform.

Jobs

Maximum number of flow jobs launched in parallel

By default, the Designer Cloud powered by Trifacta platform permits up to four jobs from the same flow to be launched in parallel for execution. If there are more flow job launches than this limit, the additional jobs are queued for execution after one or more of the launched jobs have completed.

Tip: This limit is most relevant when you are running a scheduled job, which can execute all jobs in a flow at the same time.


Max parallel jobs setting | Description
4 | (Default) Up to four jobs from the same flow can be launched and in the process of execution at the same time. Additional jobs are queued for execution.
1 | Jobs from the flow are executed sequentially.
0 | No limit. All jobs from a flow can be executed at the same time.

Steps:

  1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
  2. Locate the following parameter. Modify it according to your needs:

    "webapp.jobLaunchingBatchSize": 4,
  3. Save your changes and restart the platform.
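
To see how the setting changes when jobs start, the behavior described in the table can be simulated. The Python sketch below is a simplified model, not the platform's scheduler: it assumes all jobs are submitted at time 0 (as in a scheduled run), that queued jobs start in submission order, and that job durations are known. The `simulate_launches` function is hypothetical.

```python
import heapq

def simulate_launches(durations, batch_size):
    """Return (start, end) times for jobs launched in order, with at most
    batch_size jobs running at once. A batch_size of 0 means no limit."""
    running = []   # min-heap of end times for jobs currently running
    schedule = []
    for duration in durations:
        if batch_size and len(running) >= batch_size:
            # All slots are busy: wait for the earliest running job to finish.
            start = heapq.heappop(running)
        else:
            start = 0
        end = start + duration
        heapq.heappush(running, end)
        schedule.append((start, end))
    return schedule

jobs = [5, 5, 5, 5, 5, 5]  # six jobs of five time units each
for limit in (4, 1, 0):
    print(limit, simulate_launches(jobs, limit))
```

With a limit of 4, the fifth and sixth jobs start only after earlier jobs finish. With 1, the jobs run back to back. With 0, all six start immediately.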

Job status polling interval

Periodically, the application polls the running environment to check the status of jobs in transit. This polling occurs in the following areas of the application:

  • Jobs page - Checks to see if running jobs have been resolved.
  • Flow View page - Checks to see if running jobs have been resolved.
  • Transformer page - Checks to see if sampling jobs have been resolved.

    NOTE: This setting does not apply to the initial sample, which is derived from the first N rows of the dataset.

As needed, you can modify the interval at which the application polls for job status from these areas. The default value is 5000 milliseconds (5 seconds).

NOTE: If this setting is lowered too much, polling requests can overlap, resulting in no updates to the application. Application performance can be impeded.

Steps:

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  1. Locate the following setting and change its value:

    "webapp.polling.jobStatusInMillis" : 5000,
  2. Save your changes and restart the platform.
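
The general pattern behind interval polling can be sketched as follows. This Python example is a generic illustration and does not reflect the application's client code. The `poll_until_done` function, its parameters, and the status strings are all hypothetical. Because the loop waits for each status request to return before it sleeps, requests in this sketch cannot overlap, which is the problem that a very low interval can cause.

```python
import time

def poll_until_done(get_status, interval_ms=5000, max_polls=100, sleep=time.sleep):
    """Call get_status() every interval_ms until it reports a finished job.
    Returns the final status, or None if max_polls is exhausted."""
    for _ in range(max_polls):
        status = get_status()
        if status in ("Complete", "Failed"):  # hypothetical status names
            return status
        sleep(interval_ms / 1000)
    return None

# Usage with a canned sequence of statuses and a short interval.
statuses = iter(["Pending", "InProgress", "Complete"])
print(poll_until_done(lambda: next(statuses), interval_ms=10))
```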

Sampling

The following configuration settings define the size of samples stored in the base storage layer and transmitted to the user's web browser for display through the Transformer page.

Size of stored samples

By default, samples are generated and stored in the base storage layer up to 40 MB in size. 

NOTE: This size is applied to all user-generated samples. Modifications to this size can significantly change the volume of data stored in the backend.

NOTE: If the datasource is compressed or must be converted during ingestion, the stored size of the sample on the base storage layer can exceed this limit.

Tip: This size should be modified in conjunction with any changes to the maximum size of transferred samples, which is described in the following section.

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

Setting | Default value | Description
webapp.sampleOutputLimit | 41943040 | Sets the requested size of samples in bytes that are stored in the base storage layer for each sample. NOTE: This parameter defines the storage size of samples on the backend. Default storage size is four times larger than in previous releases. NOTE: For datasources that must be decompressed or converted during ingest, the actual storage volume may be larger than this limit.

Sample size load limit

By default, samples that are transferred to the user's web browser have a maximum size of 10 MB. If desired, users can increase or decrease this sample size on a per-recipe basis.

As needed, you can configure the following: 

  • setting the default size of samples displayed in the browser (default is 10 MB)
  • setting the maximum size of samples displayed in browser (default is 40 MB)
    • users can override the actual size of the sample downloaded to their browser based on their own experience

Notes:

Increasing the sample size may degrade performance in the following areas of the Transformer page:

  • Generation of column details and data grid histograms
  • Preview card loading time
  • Time required to complete brushing and linking in histograms

    NOTE: If you increase the sample size above the default setting and encounter unacceptable performance in the above areas, you should reduce the sample size settings.

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

Setting | Default value | Description and Notes
webapp.client.defaultLoadLimit | 10485760 | Sets the default maximum number of bytes that can be loaded into the browser for samples. NOTE: On a per-recipe basis, users can override this setting through the Transformer page. See Change Recipe Sample Size.
webapp.client.maxLoadLimit | 41943040 | Sets the maximum number of bytes that can be loaded into the browser for samples. NOTE: This value cannot be overridden by users. Users can set the sample size in their browser up to this limit and no higher.
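
The relationship between the two settings can be expressed as a small function. The Python sketch below is a model of the rule described above, not the application's code. The `effective_load_limit` function is hypothetical, and capping an oversized request at the maximum (rather than rejecting it) is an assumption.

```python
DEFAULT_LOAD_LIMIT = 10485760  # webapp.client.defaultLoadLimit (10 MB)
MAX_LOAD_LIMIT = 41943040      # webapp.client.maxLoadLimit (40 MB)

def effective_load_limit(requested=None,
                         default=DEFAULT_LOAD_LIMIT,
                         maximum=MAX_LOAD_LIMIT):
    """Resolve the sample size in bytes for a recipe.

    Uses the default when the user has not set a per-recipe size, and
    never returns more than the configured maximum.
    """
    size = default if requested is None else requested
    return min(size, maximum)

print(effective_load_limit())                  # no per-recipe override
print(effective_load_limit(20 * 1024 * 1024))  # override within the maximum
print(effective_load_limit(10**9))             # override above the maximum
```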

Photon random sample load limit

Trifacta Photon is used to generate random samples whenever it is available. By default, the Trifacta Photon running environment loads a maximum of 1 GB (1024 MB) of data from the imported dataset for generating a new random sample. This data comes from the top of the file, meaning that rows located beyond the first 1 GB of the source data cannot be included in any generated random sample.

From this selection of data, a sample is derived for display in the data grid. As needed, you can configure the random sample limit to load a larger or smaller maximum volume of data.

NOTE: Be careful making adjustments to this setting. If the volume of data is too large, you can crash the running environment.

Steps:

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  1. Locate the following setting, which is listed in terms of bytes. The default value listed below corresponds to 1 GB of data:

    "webapp.sampleLoadLimit": 1073741824,
  2. Save your changes and restart the platform. 
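
The consequence of loading only the top of the file can be shown with a toy example. The Python sketch below is not how Trifacta Photon is implemented. It only illustrates that, when sampling is restricted to the first N bytes of a source, later rows can never appear in the sample. The `random_sample_from_head` function and the line-oriented input are assumptions made for the illustration.

```python
import io
import random

def random_sample_from_head(stream, load_limit, sample_size, seed=0):
    """Sample up to sample_size lines from the first load_limit bytes."""
    head = stream.read(load_limit)
    pieces = head.split(b"\n")
    # Simplification: drop the last piece, which is either empty or a line
    # that was cut off (or left unterminated) at the load limit.
    complete = pieces[:-1]
    rng = random.Random(seed)
    return rng.sample(complete, min(sample_size, len(complete)))

# 1000 rows, but only the first 50 bytes (rows 0 through 9) are loaded.
data = b"".join(b"row%d\n" % i for i in range(1000))
sample = random_sample_from_head(io.BytesIO(data), load_limit=50, sample_size=5)
print(sample)
```

Every row in the printed sample comes from the first ten rows, regardless of how many rows the source contains.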

Wrangling Limits

Maximum split limits

By default, a single split operation can break up a single column into 250 separate columns. As needed, you can change this maximum value. 

NOTE: Increasing this limit consumes more resources and may overload the Designer Cloud application or your browser. Adjust with caution.

Tip: This limit also applies to the extraction of keys and values from Objects and Arrays.

Steps:

You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.

  1. Locate the following setting, which represents the maximum number of columns that can be generated by one of the applicable steps:

    "feature.delimSplitColumnLimit": 250,
  2. Save your changes and restart the platform. 

Relational limits

See Configure Security for Relational Connections.

Miscellaneous limits

Date range limit

By default, the Designer Cloud powered by Trifacta platform supports the following date range for Datetime data type validation:

January 1, 1400 - December 31, 2599

This date range is validated against the following default regular expression:

((?:1[4-9]|2[0-5])\d{2})

As needed, you can change the above regular expression to define your preferred date range for the Datetime data type. Your regular expression must be in the following format:

(<your_regular_expression>)

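For example, the following regular expression allows dates up to December 31, 9999. Because its second alternative matches any two leading digits, it also accepts years earlier than 1400: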

((?:1[4-9]|[0-9][0-9])\d{2})


NOTE: Use of Trifacta patterns in this field is not supported. The entry must be a valid regular expression.

Steps:

  1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
  2. Locate the following parameter:

    webapp.yearFourDigitRegex
  3. Insert your regular expression in the required format. 
  4. Save your changes and restart the platform.
  5. You should check your new Datetime date range validation against some sample data. 
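
Before you apply a new expression, you can test it against a handful of years outside the platform. The following Python sketch uses the standard `re` module, so it assumes that the expression behaves the same way in Python as in the platform's regular expression engine. Matching the pattern against the four-digit year alone is also an assumption, since this page does not describe how the platform applies the pattern to full date values.

```python
import re

DEFAULT_YEAR = re.compile(r"((?:1[4-9]|2[0-5])\d{2})")
EXTENDED_YEAR = re.compile(r"((?:1[4-9]|[0-9][0-9])\d{2})")

def year_is_valid(pattern, year):
    """Return True if the four-digit year string matches the whole pattern."""
    return pattern.fullmatch(year) is not None

for year in ["1399", "1400", "2599", "2600", "9999"]:
    print(year,
          year_is_valid(DEFAULT_YEAR, year),
          year_is_valid(EXTENDED_YEAR, year))
```

The default expression accepts only 1400 through 2599, while the extended example accepts every four-digit year, including 1399.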
