
...

Available Environments

Trifacta Photon Running Environment

This running environment is available through the Trifacta node. When enabled, select Run on Photon.

Info

NOTE: This running environment is enabled by default.

...

Required Configuration: See Configure Photon Running Environment.

Supported Output Formats: CSV, JSON, Avro, Parquet

...

Info

NOTE: When a recipe containing a user-defined function is applied to text data, any null characters cause records to be truncated during Trifacta Photon job execution. In these cases, please execute the job on Spark.

Spark Running Environment

This running environment is the new default running environment. The Spark running environment deploys Spark libraries from the Trifacta node to the nodes of the integrated Hadoop cluster. Spark uses in-memory processing for jobs, which limits read/write operations on each node's disk and thereby shortens the time to execute jobs.

...

If you have deployed the Trifacta platform to integrate with an Amazon EMR cluster, you can run Spark-based jobs on the cluster. This environment is similar to the Spark running environment.

Required Installation: None.

...

This running environment deploys Spark libraries from the Trifacta node to the nodes of the Azure Databricks cluster. Spark uses in-memory processing for jobs, which limits read/write operations on each node's disk and thereby shortens the time to execute jobs.

...

  • The Spark running environment requires a Hadoop cluster as the backend job execution environment.
    • In the Run Job page, select Run on Spark.
  • The Trifacta Photon running environment executes on the Trifacta node and provides processing for the front-end client and at execution time.
    • In the Run Job page, select Run on Photon. This running environment is enabled by default.
    • For more information on disabling the Trifacta Photon running environment, see Configure Photon Running Environment; a minimal sketch of the relevant setting follows this list.
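For reference, disabling Trifacta Photon typically reduces to a single platform setting. The following is a minimal sketch only, using the photon.enabled parameter from the table below; see Configure Photon Running Environment for the authoritative procedure and the location of the settings file in your deployment.

Code Block
"photon.enabled": false,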

Type | Running Environment | Configuration Parameters | Notes
Hadoop Backend | Spark | webapp.runInHadoop = true | The Spark running environment is the default configuration.
Client Front-end and non-Hadoop Backend | Trifacta Photon | photon.enabled = true | Trifacta Photon is the default running environment for the front-end of the application. It is enabled by default.
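As an illustration, the two defaults above might appear side by side in the platform configuration. This is a sketch only: the parameter names are taken from the table, but the settings file and surrounding keys vary by deployment.

Code Block
"webapp.runInHadoop": true,
"photon.enabled": true,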

...

Info

NOTE: If your environment has no running environment, such as Spark, for running large-scale jobs, this parameter is not used. All jobs are run on the Trifacta node.

Code Block
"webapp.client.maxExecutionBytes.photon": 1000000000,

...

Running Environment | Default Condition
Trifacta Photon | Size of primary datasource is less than or equal to the above value in bytes.
Spark | Size of primary datasource is greater than the above value in bytes.
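For example, with the value shown above (1,000,000,000 bytes, roughly 1 GB), a job whose primary datasource is 200 MB defaults to Trifacta Photon, while a job whose primary datasource is 2 GB defaults to Spark.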

...

Warning

Setting this value too high forces more jobs onto the Trifacta Photon running environment, which may cause slow performance and can potentially overwhelm the server.

Tip

Tip: To force the default setting to always be a Hadoop or bulk running environment, set this value to 0. The bulk option is recommended over the Trifacta Photon running environment for all users; however, smaller jobs may take longer than expected to execute.
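For example, forcing all jobs onto the bulk running environment is a one-line change to the parameter shown earlier; this sketch assumes the same settings file as the previous Code Block.

Code Block
"webapp.client.maxExecutionBytes.photon": 0,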