- If you are working with compressed or binary formats, use the expanded sizes for your data volume estimates.
- Some workloads are more compute- or memory-intensive and may increase the required number of nodes or capabilities of each node. These include:
  - Scripts with complex steps, such as sorts and joins (particularly joins between large datasets)
  - Lengthy scripts
- In high availability mode, the total number of connections across all nodes should meet the appropriate requirements in the above table. To size each node, divide the number of connections by the number of nodes.
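The per-node division for high availability mode can be sketched as below. This is a minimal illustration, not part of the product: the function name `connections_per_node` is hypothetical, and rounding up to a whole connection is an assumption that the guidelines do not state.

```python
import math


def connections_per_node(total_connections: int, node_count: int) -> int:
    """Split a cluster-wide connection requirement across HA nodes.

    Assumption: the result is rounded up, so the sum over all nodes
    never falls below the cluster-wide requirement.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if total_connections < 0:
        raise ValueError("total_connections must not be negative")
    return math.ceil(total_connections / node_count)


# Illustrative figures only; take the real total from the sizing table.
per_node = connections_per_node(total_connections=100, node_count=3)
print(per_node)
```

Rounding up means the nodes together may provide slightly more connections than the table requires, which errs on the side of capacity.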
Amazon Marketplace AMI
Microsoft Azure installations support a limited range of installation options, based on the type of cluster integration.
Please use the Enterprise Hadoop guidelines listed previously.
For more information on this integration, see Configure for HDInsight in the Configuration Guide.
Please review the Enterprise Hadoop guidelines with
For more information on this integration, see Configure for Azure Databricks in the Configuration Guide.