The platform includes an embedded high-performance running environment for executing jobs against small- to medium-sized datasets. This environment, called the Photon running environment, also supports working with much larger samples in the Transformer page, which lets you build your recipes against higher-fidelity dataset snapshots.

By default, the Photon running environment is enabled for new installations. Additional configuration is required.

This section provides information on how to enable and configure the Photon running environment. 

Known Limitations and Issues

NOTE: For profiles executed in the Photon running environment, percentages for valid, missing, or mismatched column values may not add up to 100% due to rounding.

Desktop Requirements

NOTE: To interact with the Photon running environment, all desktop instances of Google Chrome must have the PNaCl component enabled and updated to the minimum supported version. See Desktop Requirements.

Example Configuration

The following are the default values for Photon enablement:

"photon": {
  "cacheEnabled": true,
  "numThreads": 4,
  "enabled": true,
  "distroPath": "/photon/dist/centos6/photon",
  "loadScalingFactor": 20,
  "traceExecution": false,
  "websocket": {
    "host": "localhost",
    "port": 8082
  },
  "mode": "pnacl"
},

| Parameter | Description |
| --- | --- |
| cacheEnabled | Debugging setting. Leave the default value. |
| numThreads | Maximum number of threads permitted to the Photon process. See below for recommended values. |
| enabled | Verify that this value is set to true. This parameter must be properly set to enable Photon. See below. |
| distroPath | Verify that this property is set to "/photon/dist/centos6/photon", which works for all operating system distributions. |
| loadScalingFactor | Used in conjunction with other parameters to define the maximum size of samples for Photon. For more information, see Sample Size Limits below. |
| traceExecution | Debugging setting. Leave the default value. |
| websocket.host | Hostname of the web socket service. Leave this value as localhost. |
| websocket.port | Port number of the web socket service. Default value is 8082. |
| mode | Set this value to pnacl. This parameter must be properly set to enable Photon. See below. |
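
The checks described in the parameter table can be scripted. The following is a minimal Python sketch, not part of the product: the function name `check_photon_settings` is hypothetical, and the sample input is simply the default block shown above, wrapped in braces so that it parses as JSON.

```python
import json

# Required values, taken from the parameter descriptions above.
REQUIRED_VALUES = {
    "enabled": True,
    "mode": "pnacl",
    "distroPath": "/photon/dist/centos6/photon",
}


def check_photon_settings(photon):
    """Return a list of problems found; an empty list means the block looks fine."""
    problems = []
    for key, expected in REQUIRED_VALUES.items():
        actual = photon.get(key)
        if actual != expected:
            problems.append(f"photon.{key} is {actual!r}, expected {expected!r}")
    return problems


# Hypothetical input: the default "photon" block wrapped in an outer object.
config = json.loads("""
{
  "photon": {
    "cacheEnabled": true,
    "numThreads": 4,
    "enabled": true,
    "distroPath": "/photon/dist/centos6/photon",
    "loadScalingFactor": 20,
    "traceExecution": false,
    "websocket": {"host": "localhost", "port": 8082},
    "mode": "pnacl"
  }
}
""")

print(check_photon_settings(config["photon"]))
```

For the default values, the printed list is empty. Any entry in the list names the parameter to correct before restarting the platform.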

Recommended Photon Configuration by Core Count

On the node, you can adjust the resources claimed by the Photon running environment based on the number of cores on the machine. The following table identifies the recommended settings for a node with 8, 16, or 32 cores. The default settings assume 16 cores.

| Parameter | 8 cores | 16 cores (default) | 32 cores |
| --- | --- | --- | --- |
| webapp.numProcesses | 2 | 2 | 5 |
| vfs-service.numProcesses | 2 | 2 | 3 |
| photon.numThreads | 2 | 4 | 4 |
| batchserver.workers.photon.max | 2 | 2 | 4 |
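
As an illustration, the recommendations in the table above can be expressed as a lookup. This is a hedged Python sketch: the name `recommended_settings` is hypothetical, and rounding down to the nearest documented core count for machines that fall between rows is an assumption of this example, not guidance from the table.

```python
# Recommended values per core count, transcribed from the table above.
RECOMMENDED = {
    8: {
        "webapp.numProcesses": 2,
        "vfs-service.numProcesses": 2,
        "photon.numThreads": 2,
        "batchserver.workers.photon.max": 2,
    },
    16: {
        "webapp.numProcesses": 2,
        "vfs-service.numProcesses": 2,
        "photon.numThreads": 4,
        "batchserver.workers.photon.max": 2,
    },
    32: {
        "webapp.numProcesses": 5,
        "vfs-service.numProcesses": 3,
        "photon.numThreads": 4,
        "batchserver.workers.photon.max": 4,
    },
}


def recommended_settings(cores):
    """Return the row for the largest documented core count not exceeding `cores`.

    Rounding down is an assumption made for this sketch.
    """
    eligible = [count for count in RECOMMENDED if count <= cores]
    if not eligible:
        raise ValueError(f"No recommendation documented for {cores} cores")
    return RECOMMENDED[max(eligible)]


for name, value in recommended_settings(16).items():
    print(f'"{name}": {value},')
```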

The number of simultaneous users is a competing factor.

The following table illustrates some adjustments for a 16-core system:

| Parameter | 16 cores (default) | Low number of simultaneous users | High number of simultaneous users |
| --- | --- | --- | --- |
| webapp.numProcesses | 2 | 1 | 4 |
| vfs-service.numProcesses | 2 | 1 | 4 |
| photon.numThreads | 4 | 4 | 4 |
| batchserver.workers.photon.max | 2 | 2 | 2 |

Disable

This running environment is enabled by default. Please complete the following configuration to disable the running environment.

NOTE: A cluster-based running environment, such as Spark, must be available for processing jobs when this one is disabled.


Steps:

  1. To disable the Photon running environment, apply the following configuration settings:

    "webapp.runInTrifactaServer": false,
    "feature.enableSamplingScanOptions": false,
    "feature.enableFirstRowsSample": false, 
  2. Do not change the following, which applies to the Photon web-client:

    "photon.enabled": true,
  3. Save your changes and restart the platform.

Change Limits

NOTE: Increasing these values can have a significant impact on load times and performance. Change these values only if you are experiencing difficulties. Make incremental changes.

Sample Size Limits

Increasing the sample size may degrade the user experience in the Transformer page.

When samples are created using the Photon running environment, their maximum size is determined by multiplying the values for the following settings. Default value is 10 MB.

| Setting | webapp.client.loadLimit | photon.loadScalingFactor | Total |
| --- | --- | --- | --- |
| Default Value | 512000 | 20 | 10240000 |

Maximum Data in the Client

The following settings determine the maximum amount of data that is permitted to be passed to the client from Photon:

| Setting | webapp.client.maxResultsBytes | photon.loadScalingFactor | Total |
| --- | --- | --- | --- |
| Default Value | 2097152 | 20 | 41943040 |
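
Both limits above are the product of a base setting and photon.loadScalingFactor. The following Python sketch reproduces the default totals. The function name `photon_limit_bytes` is hypothetical and exists only for this illustration.

```python
def photon_limit_bytes(base_limit, load_scaling_factor):
    """Multiply a base limit (in bytes) by photon.loadScalingFactor."""
    return base_limit * load_scaling_factor


LOAD_SCALING_FACTOR = 20  # default photon.loadScalingFactor

# webapp.client.loadLimit x photon.loadScalingFactor
max_sample_bytes = photon_limit_bytes(512000, LOAD_SCALING_FACTOR)

# webapp.client.maxResultsBytes x photon.loadScalingFactor
max_client_bytes = photon_limit_bytes(2097152, LOAD_SCALING_FACTOR)

print(max_sample_bytes)  # 10240000
print(max_client_bytes)  # 41943040
```

Because the scaling factor is shared, raising photon.loadScalingFactor increases both the maximum sample size and the maximum data passed to the client.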

Timeouts

Photon runtime job timeout

By default, the platform imposes no limit on the execution time of a Photon job. As needed, you can enable and configure a limit.

Steps:

  1. Locate the following settings:

    "photon.timeoutEnabled": false,
    "photon.timeoutMinutes": 180,

    | Setting | Description |
    | --- | --- |
    | timeoutEnabled | Set to false to disable job limiting. Set to true to enable the timeout specified below. |
    | timeoutMinutes | Defines the number of minutes that a Photon job is permitted to run. Default value is 180 (three hours). |
  2. Save your changes and restart the platform.

When a job has failed due to exceeding a timeout, additional information is available in the job logs. The following is a good search term for this type of error:

java.lang.Exception: Photon job '<jobId>' timeout

where <jobId> is the internal job identifier.

Job logs can be downloaded from the Job page. See Jobs Page.
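
After downloading a job log, you can scan it for the timeout message programmatically. The following is a minimal Python sketch. The sample log text is hypothetical, apart from the exception message format quoted above.

```python
import re

# Matches the timeout message quoted above and captures the job identifier.
TIMEOUT_PATTERN = re.compile(r"java\.lang\.Exception: Photon job '([^']+)' timeout")


def find_timed_out_jobs(log_text):
    """Return the job identifiers of all Photon timeout errors in the log text."""
    return TIMEOUT_PATTERN.findall(log_text)


# Hypothetical log excerpt for illustration only.
sample_log = """\
INFO  starting job
java.lang.Exception: Photon job '740' timeout
INFO  cleaning up
"""

print(find_timed_out_jobs(sample_log))  # ['740']
```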

Photon memory timeout

To prevent crashes of the server, Photon imposes a memory consumption limit for each job. If this memory threshold is exceeded, the job is automatically killed. As needed, you can disable this memory protection (not recommended) or change the memory threshold at which jobs are killed.

Steps:

  1. Locate the following settings:

    "photon.memoryMonitorEnabled": false,
    "photon.memoryPercentageThreshold": 60,

    | Setting | Description |
    | --- | --- |
    | memoryMonitorEnabled | Set to false to disable memory monitoring. Set to true to enable the threshold specified below. |
    | memoryPercentageThreshold | Defines the percentage of total available system memory that a Photon job process is permitted to consume. Default value is 60 (60%). |

    Tip: This threshold applies to individual Photon jobs. If this threshold value is over 50%, it is possible for two concurrent Photon jobs to use more than the available memory, crash the server, and force a restart. You may wish to start by setting threshold values at a lower level.

  2. Save your changes and restart the platform.
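
The arithmetic behind the tip above can be sketched as follows. This Python example assumes, for illustration only, that every concurrent Photon job reaches the threshold at the same time and that nothing else on the server consumes memory. The function names are hypothetical.

```python
def worst_case_percent(threshold_percent, concurrent_jobs):
    """Upper bound on combined memory use if every job reaches the threshold."""
    return threshold_percent * concurrent_jobs


def can_exceed_memory(threshold_percent, concurrent_jobs):
    """True if the combined worst case is more than 100% of system memory."""
    return worst_case_percent(threshold_percent, concurrent_jobs) > 100


print(can_exceed_memory(60, 2))  # True: 2 x 60% = 120%
print(can_exceed_memory(50, 2))  # False: 2 x 50% = 100%
```

In practice, other services also use memory on the server, so a threshold well below the computed bound is the safer starting point.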

When a job has failed due to exceeding this memory threshold, additional information is available in the job logs. The following is a good search term for this type of error:

java.lang.Exception: Photon job '<jobId>' failed with memory consumption over threshold

where <jobId> is the internal job identifier.

Below this line item, you may see the following entries, which can provide additional information to adjust the memory settings:

2017-05-04T02:26:40.549Z [job-id 740] com.trifacta.joblaunch.util.ProcessMonitorUtil [Thread-20] INFO  com.trifacta.joblaunch.util.ProcessMonitorUtil - Global memory size: 8373186560 bytes
2017-05-04T02:26:40.555Z [job-id 740] com.trifacta.joblaunch.util.ProcessMonitorUtil [Thread-20] INFO  com.trifacta.joblaunch.util.ProcessMonitorUtil - Available global memory size at process start: 2672959488 bytes
...
2017-05-04T02:29:15.690Z [job-id 740] com.trifacta.joblaunch.util.ProcessMonitorUtil [Thread-20] INFO  com.trifacta.joblaunch.util.ProcessMonitorUtil - Current memory consumption: 5.614080429077148%
2017-05-04T02:29:15.691Z [job-id 740] com.trifacta.joblaunch.util.ProcessMonitorUtil [Thread-20] ERROR com.trifacta.joblaunch.util.ProcessMonitorUtil - Average memory consumption for the past 15 seconds over 5% threshold: 5.174326801300049 %. Current available global memory: 2244628480 bytes

| Item | Description |
| --- | --- |
| Global memory size | Total available global memory in bytes |
| Available global memory size at process start | Total available memory in bytes when the job is launched |
| Current memory consumption | Current memory usage for the job process as a percentage of the total. This metric is posted to the log every 30 seconds and can be used to debug memory leaks. |
| Average memory consumption for the past 15 seconds over x% threshold | When the job fails due to the memory threshold, this metric identifies the average memory consumption percentage over the past 15 seconds. The defined threshold percentage is included. |
| Current available global memory | When the job fails, this metric identifies the total available memory at the time of failure. |

Job logs can be downloaded from the Job page. See Jobs Page.
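
To pull the memory metrics out of a downloaded job log, a small parser is often enough. The following Python sketch relies only on the message formats shown in the sample log entries above. The function name and the dictionary keys are hypothetical.

```python
import re

# One pattern per metric, based on the sample log entries above.
PATTERNS = {
    "global_memory_bytes": re.compile(r"Global memory size: (\d+) bytes"),
    "available_at_start_bytes": re.compile(
        r"Available global memory size at process start: (\d+) bytes"
    ),
    "current_consumption_percent": re.compile(
        r"Current memory consumption: ([\d.]+)%"
    ),
}


def summarize_memory_log(log_text):
    """Return the last reported value of each metric found in the log text."""
    summary = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(log_text)
        if matches:
            summary[name] = float(matches[-1])
    return summary


# Messages copied from the sample entries above, with prefixes trimmed.
sample_log = """\
INFO  ProcessMonitorUtil - Global memory size: 8373186560 bytes
INFO  ProcessMonitorUtil - Available global memory size at process start: 2672959488 bytes
INFO  ProcessMonitorUtil - Current memory consumption: 5.614080429077148%
"""

for name, value in summarize_memory_log(sample_log).items():
    print(name, value)
```

Tracking the consumption value across successive log entries shows whether memory use grows steadily over the life of the job.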

Batch FileSystem Access Timeout Settings

The default timeout settings for reading and writing of data from the client browser through Photon should work in most cases. 

Particularly when reading from large tables, you might discover errors similar to the following:

06:21:21.365 [Job 23] INFO com.trifacta.hadoopdata.photon.BatchPhotonRunner - terminating with uncaught exception of type Poco::TimeoutException: Timeout
06:21:21.375 [Job 23] INFO com.trifacta.hadoopdata.photon.BatchPhotonRunner - /vagrant/photon/dist/centos6/photon/bin/photon-cli: line 22: 15639 Aborted ${command[@]}

Steps:

  1. Locate the photon.extraCliArgs node.

  2. Add the following values to the extraCliArgs entry:

    "photon.extraCliArgs" : "-batch_vfs_read_timeout <300> -batch_vfs_write_timeout <300>"

    | Argument | Description |
    | --- | --- |
    | -batch_vfs_read_timeout | Timeout limit in seconds for read operations from the datastore. Default value is 300 seconds (5 minutes). Tip: Raising the value to 3600 seconds should be fine in most environments. Avoid setting this value above 7200 seconds (2 hours). |
    | -batch_vfs_write_timeout | Timeout limit in seconds for write operations to the datastore. Default value is 300 seconds (5 minutes). NOTE: Do not modify unless specifically instructed to do so. |

  3. To reduce timeouts, raise the above settings.
  4. Save your changes and restart the platform. 

Configure VFS Service

The VFS Service serves the front-end interface and brokers connections with the backend datastores when the Photon running environment is enabled.

NOTE: The VFS service must be enabled when Photon is enabled.

Steps:

  1. Locate the following configuration: 

    "vfs-service.port":41913,
    "vfs-service.loggerOptions.silent":false,
    "vfs-service.loggerOptions.level":"info",
    "vfs-service.loggerOptions.json":false,
    "vfs-service.loggerOptions.format":":method :url :status :res[content-length] :response-time :referrer :remote-addr :trifacta-user :user-agent",
    "vfs-service.host":"localhost",
    "vfs-service.enabled":true,
    "vfs-service.bindHost":"0.0.0.0",
    "vfs-service.autoRestart":true,
  2. Verify that the enabled parameter is set to true.
  3. Additional configuration settings are described below. 
  4. Save your changes and restart the platform.

| Parameter | Description |
| --- | --- |
| port | Port number that the VFS service uses to communicate. Default value is 41913. This port must be open on the node. |
| loggerOptions.silent | When set to true, messages are suppressed in the user interface. |
| loggerOptions.level | Supported logging levels: info (default), warning, error, and debug. NOTE: The debug logging level is very verbose. |
| loggerOptions.json | When set to true, log messages are written in JSON format. |
| loggerOptions.format | If needed, you can re-order the fields that are included in each log message. |
| host | Host of the VFS Service. Leave this value as localhost. |
| enabled | Set this value to true to enable the VFS Service. |
| bindHost | Do not modify this value. |
| autoRestart | When set to true, the VFS Service automatically restarts and attempts to return to its pre-restart state. This value should be set to false for debugging purposes only. |

Use Photon

When Photon is enabled, it is available like any other running environment in the application. When executing a job, select this running environment from the drop-down in the Run Job dialog.

NOTE: Before you test, please be sure to complete all steps of Required Platform Configuration.