
The Trifacta application can connect to a high-performance environment embedded in the Trifacta node for execution of jobs against small- to medium-sized datasets, called the Photon running environment.

  • The Photon running environment can be selected in the Run Jobs page.

By default, the Photon running environment is enabled for new installations.

Features:

  • Faster execution times for transform and profiling jobs
  • Better consistency with typecasting done in Spark jobs

This section provides information on how to enable and configure the Photon running environment.

NOTE: Some configuration is shared with the Photon client. For more information, see Configure Photon Client.


Limitations

NOTE: For profiles executed in the Photon running environment, percentages for valid, missing, or mismatched column values may not add up to 100% due to rounding. See Overview of Visual Profiling.

Disable Photon Running Environment

The Photon running environment is enabled by default. To disable it, complete the following configuration.

NOTE: A cluster-based running environment, such as Spark, must be available to process jobs when the Photon running environment is disabled.


Steps:

  1. Open the platform configuration.
  2. To disable the Photon running environment, apply the following configuration settings (a sketch of the nested configuration form follows these steps):

    Code Block
    "webapp.runInTrifactaServer": false,
    "feature.enableSamplingScanOptions": false,
    "feature.enableFirstRowsSample": false, 
  3. Save your changes and restart the platform.
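If you maintain these settings by editing the platform configuration file directly, the flat dotted keys shown above typically correspond to nested JSON objects, much like the nested "photon" block shown in the next section. The following is a minimal sketch under that assumption; the nesting shown here is illustrative and not confirmed for every key:

Code Block
"webapp": {
  "runInTrifactaServer": false
},
"feature": {
  "enableSamplingScanOptions": false,
  "enableFirstRowsSample": false
},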

Example Configuration

The following configuration includes the default values.

 

Code Block
"photon": {
  "cacheEnabled": true,
  "numThreads": 4,
  "enabled": true,
  "distroPath": "/photon/dist/centos6/photon",
  "traceExecution": false,
  "websocket": {
    "host": "localhost",
    "port": 8082
  },
  "mode": "wasm"
},

Some of these values apply to the Photon client. For more information, see Configure Photon Client.

Parameters:

  • cacheEnabled: Debugging setting. Leave the default value.
  • numThreads: Maximum number of threads permitted to the Photon process. For recommended values, see Configure Photon Client.
  • enabled: This value should be set to true. For more information, see Configure Photon Client.
  • distroPath: Verify that this property is set to the following value, which works for all operating system distributions:

    Code Block
      "distroPath": "/photon/dist/centos6/photon",

  • traceExecution: Debugging setting. Leave the default value.
  • websocket.host: Internal parameter. Do not modify.
  • websocket.port: Internal parameter. Do not modify.
  • mode: Set this value to wasm. For more information, see Configure Desktops.

Modify Limits

Runtime job timeout

By default, the Trifacta platform imposes no limit on the execution time of a Photon job. As needed, you can enable and configure a limit.

Steps:

  1. Open the platform configuration and locate the following settings (an example of enabling the limit follows these steps):

    Code Block
    "photon.timeoutEnabled": false,
    "photon.timeoutMinutes": 180,

    Settings:

      • timeoutEnabled: Set to false to disable job limiting. Set to true to enable the timeout specified below.
      • timeoutMinutes: Defines the number of minutes that a Photon job is permitted to run. Default value is 180 (three hours).

  2. Save your changes and restart the platform.
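For example, to enforce a four-hour limit on Photon job execution, you might apply the following settings. This is a minimal sketch; the 240-minute value is illustrative only:

Code Block
"photon.timeoutEnabled": true,
"photon.timeoutMinutes": 240,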

When a job has failed due to exceeding a timeout, additional information is available in the job logs. The following is a good search term for this type of error:

Code Block
java.lang.Exception: Photon job '<jobId>' timeout

where <jobId> is the internal job identifier.

Job logs can be downloaded from the Job page. See Jobs Page.

Photon running environment memory limit

To prevent crashes, the Photon running environment imposes a memory consumption limit on each job. If this limit is exceeded, the job is automatically killed. As needed, you can disable this memory protection (not recommended) or change the memory threshold at which jobs are killed.

Steps:

  1. Open the platform configuration.

  2. Locate the following settings:

    Code Block
    "photon.memoryMonitorEnabled": false,
    "photon.memoryPercentageThreshold": 60,

    Settings:

      • memoryMonitorEnabled: Set to false to disable memory monitoring. Set to true to enable the threshold specified below.
      • memoryPercentageThreshold: Defines the percentage of total available system memory that a Photon job process is permitted to consume. Default value is 60 (60%).

    Tip: This threshold applies to individual jobs. If this threshold value is over 50%, it is possible for two concurrent Photon jobs to use more than the available memory, crash the server, and force a restart. You may wish to start by setting threshold values at a lower level. An example of a lower setting follows these steps.

  3. Save your changes and restart the platform.
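Following the tip above, an administrator who expects two Photon jobs to run concurrently might enable monitoring and lower the threshold so that combined consumption stays below total memory. This is a minimal sketch; the value of 40 (40% per job, 80% for two concurrent jobs) is illustrative only:

Code Block
"photon.memoryMonitorEnabled": true,
"photon.memoryPercentageThreshold": 40,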

When a job has failed due to exceeding this memory threshold, additional information is available in the job logs. The following is a good search term for this type of error:

Code Block
java.lang.Exception: Photon job '<jobId>' failed with memory consumption over threshold

where <jobId> is the internal job identifier.

Below this line item, you may see the following entries, which can provide additional information for adjusting the memory settings:

Code Block
2017-05-04T02:26:40.549Z [job-id 740] com.trifacta.joblaunch.util.ProcessMonitorUtil [Thread-20] INFO  com.trifacta.joblaunch.util.ProcessMonitorUtil - Global memory size: 8373186560 bytes
2017-05-04T02:26:40.555Z [job-id 740] com.trifacta.joblaunch.util.ProcessMonitorUtil [Thread-20] INFO  com.trifacta.joblaunch.util.ProcessMonitorUtil - Available global memory size at process start: 2672959488 bytes
...
2017-05-04T02:29:15.690Z [job-id 740] com.trifacta.joblaunch.util.ProcessMonitorUtil [Thread-20] INFO  com.trifacta.joblaunch.util.ProcessMonitorUtil - Current memory consumption: 5.614080429077148%
2017-05-04T02:29:15.691Z [job-id 740] com.trifacta.joblaunch.util.ProcessMonitorUtil [Thread-20] ERROR com.trifacta.joblaunch.util.ProcessMonitorUtil - Average memory consumption for the past 15 seconds over 5% threshold: 5.174326801300049 %. Current available global memory: 2244628480 bytes
Log entries:

  • Global memory size: Total available global memory in bytes.
  • Available global memory size at process start: Total available memory in bytes when the job is launched.
  • Current memory consumption: Current memory usage for the job process as a percentage of the total. This metric is posted to the log every 30 seconds and can be used to debug memory leaks.
  • Average memory consumption for the past 15 seconds over x% threshold: When the job fails due to the memory threshold, this metric identifies the average memory consumption percentage over the past 15 seconds. The defined threshold percentage is included.
  • Current available global memory: When the job fails, this metric identifies the total available memory at the time of failure.

Job logs can be downloaded from the Job page. See Jobs Page.

Batch FileSystem Access Timeout Settings

The default timeout settings for reading and writing data from the client browser through the Photon running environment to the Trifacta node should work in most cases.

Particularly when reading from large tables, you might discover errors similar to the following:

Code Block
06:21:21.365 [Job 23] INFO com.trifacta.hadoopdata.photon.BatchPhotonRunner - terminating with uncaught exception of type Poco::TimeoutException: Timeout
06:21:21.375 [Job 23] INFO com.trifacta.hadoopdata.photon.BatchPhotonRunner - /vagrant/photon/dist/centos6/photon/bin/photon-cli: line 22: 15639 Aborted ${command[@]}

Steps:

  1. Open the platform configuration.
  2. Locate the photon.extraCliArgs node.

  3. Add the following values to the extraCliArgs entry:

    Code Block
    "photon.extraCliArgs" : "-batch_vfs_read_timeout <300> -batch_vfs_write_timeout <300>"

    Arguments:

      • -batch_vfs_read_timeout: Timeout limit in seconds for read operations from the datastore. Default value is 300 seconds (5 minutes).

        Tip: Raising the value to 3600 seconds should be fine in most environments. Avoid setting this value above 7200 seconds (2 hours).

      • -batch_vfs_write_timeout: Timeout limit in seconds for write operations to the datastore. Default value is 300 seconds (5 minutes).

        NOTE: Do not modify unless specifically instructed by Trifacta Support.

    An example with concrete values follows these steps.

  4. To reduce the occurrence of timeouts, raise the values of the above settings.
  5. Save your changes and restart the platform.
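For example, to accommodate long reads from large tables while leaving write behavior at its default, you might raise the read timeout to 3600 seconds, as suggested in the tip above. This is a minimal sketch that assumes the timeout values are supplied as plain integers:

Code Block
"photon.extraCliArgs" : "-batch_vfs_read_timeout 3600 -batch_vfs_write_timeout 300"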

Tuning Photon

For more information on tuning the performance of Photon, see Tune Application Performance.

Configure VFS Service

The Photon running environment interacts with backend datastores through the VFS service.

NOTE: The VFS service rarely requires non-default configuration.


For more information, see Configure VFS Service.

Use Photon Running Environment

When executing a job, select the Photon option.

NOTE: Before you test, please be sure to complete all steps of Required Platform Configuration.