Interface PipelineCluster

interface PipelineCluster {
    applyPolicyDefaultValues?: boolean;
    awsAttributes?: PipelinesAwsAttributes;
    azureAttributes?: PipelinesAzureAttributes;
    clusterLogConf?: PipelinesClusterLogConf;
    customTags?: Record<string, string>;
    driverInstancePoolId?: string;
    driverNodeTypeId?: string;
    enableLocalDiskEncryption?: boolean;
    gcpAttributes?: PipelinesGcpAttributes;
    initScripts?: PipelinesInitScriptInfo[];
    instancePoolId?: string;
    label?: string;
    nodeTypeId?: string;
    policyId?: string;
    size?:
        | { $case: "numWorkers"; numWorkers: number }
        | { $case: "autoscale"; autoscale: PipelinesAutoScale };
    sparkConf?: Record<string, string>;
    sparkEnvVars?: Record<string, string>;
    sshPublicKeys?: string[];
}

Index

Properties

applyPolicyDefaultValues? awsAttributes? azureAttributes? clusterLogConf? customTags? driverInstancePoolId? driverNodeTypeId? enableLocalDiskEncryption? gcpAttributes? initScripts? instancePoolId? label? nodeTypeId? policyId? size? sparkConf? sparkEnvVars? sshPublicKeys?

Properties

`Optional`applyPolicyDefaultValues

applyPolicyDefaultValues?: boolean

Note: This field won't be persisted. Only API users will check this field.

`Optional`awsAttributes

awsAttributes?: PipelinesAwsAttributes

Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.

`Optional`azureAttributes

azureAttributes?: PipelinesAzureAttributes

Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.

`Optional`clusterLogConf

clusterLogConf?: PipelinesClusterLogConf

The configuration for delivering Spark logs to a long-term storage destination. Only DBFS destinations are supported, and only one destination can be specified per cluster. If this configuration is given, logs are delivered to the destination every 5 minutes. Driver logs are written to $destination/$clusterId/driver, and executor logs are written to $destination/$clusterId/executor.
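The path layout described above can be sketched as a small helper. This is only an illustration: `clusterLogPaths` and its two parameters are hypothetical names, not part of the SDK, and the cluster ID in the usage line is a made-up placeholder.

```typescript
// Hypothetical helper (not an SDK function): derives the driver and
// executor log locations from a log destination and a cluster ID.
function clusterLogPaths(
    destination: string,
    clusterId: string,
): { driver: string; executor: string } {
    // Drop trailing slashes so the joined path has no double slash.
    const base = destination.replace(/\/+$/, "");
    return {
        driver: `${base}/${clusterId}/driver`,
        executor: `${base}/${clusterId}/executor`,
    };
}

// Placeholder destination and cluster ID, for illustration only.
const logPaths = clusterLogPaths("dbfs:/cluster-logs/", "0123-456789-abcde");
console.log(logPaths.driver); // dbfs:/cluster-logs/0123-456789-abcde/driver
```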

`Optional`customTags

customTags?: Record<string, string>

Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

- Currently, at most 45 custom tags are allowed.
- Clusters can only reuse cloud resources if the resources' tags are a subset of the cluster tags.
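
The two notes above can be checked client-side before a request is sent. The sketch below is an assumption-laden illustration: `TagMap`, `withinCustomTagLimit`, and `isTagSubset` are hypothetical names, and "subset" is interpreted here as every resource tag key being present on the cluster with the same value.

```typescript
// Hypothetical local alias; customTags itself is a Record<string, string>.
type TagMap = Record<string, string>;

// Limit taken from the note above.
const MAX_CUSTOM_TAGS = 45;

// True when the number of custom tags does not exceed the documented limit.
function withinCustomTagLimit(tags: TagMap): boolean {
    return Object.keys(tags).length <= MAX_CUSTOM_TAGS;
}

// Assumed interpretation of "subset": each resource tag must exist on the
// cluster with an identical value.
function isTagSubset(resourceTags: TagMap, clusterTags: TagMap): boolean {
    return Object.keys(resourceTags).every(
        (key) => clusterTags[key] === resourceTags[key],
    );
}

const exampleClusterTags: TagMap = { team: "data-eng", env: "prod" };
console.log(withinCustomTagLimit(exampleClusterTags));             // true
console.log(isTagSubset({ team: "data-eng" }, exampleClusterTags)); // true
console.log(isTagSubset({ team: "ml" }, exampleClusterTags));       // false
```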

`Optional`driverInstancePoolId

driverInstancePoolId?: string

The optional ID of the instance pool to which the driver of the cluster belongs. If no driver pool is assigned, the cluster uses the instance pool with ID instance_pool_id for the driver.

`Optional`driverNodeTypeId

driverNodeTypeId?: string

The node type of the Spark driver. This field is optional; if unset, the driver node type is set to the same value as node_type_id (nodeTypeId).
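
The fallback behavior of driverNodeTypeId and driverInstancePoolId can be pictured as follows. This is a minimal sketch of the documented defaults, not the service's implementation: `DriverPlacementFields` is a local stand-in that copies only the four relevant fields, and `effectiveDriverPlacement` is a hypothetical helper.

```typescript
// Local stand-in holding only the fields involved in the fallback;
// it is not the SDK's PipelineCluster type.
interface DriverPlacementFields {
    nodeTypeId?: string;
    driverNodeTypeId?: string;
    instancePoolId?: string;
    driverInstancePoolId?: string;
}

// Hypothetical helper: resolves what the driver would use when the
// driver-specific fields are left unset.
function effectiveDriverPlacement(c: DriverPlacementFields): {
    nodeTypeId: string | undefined;
    instancePoolId: string | undefined;
} {
    return {
        // Driver node type falls back to the worker node type.
        nodeTypeId: c.driverNodeTypeId ?? c.nodeTypeId,
        // Driver pool falls back to the cluster's instance pool.
        instancePoolId: c.driverInstancePoolId ?? c.instancePoolId,
    };
}

// Placeholder node type and pool IDs, for illustration only.
const placement = effectiveDriverPlacement({
    nodeTypeId: "worker-node-type",
    instancePoolId: "pool-workers",
    driverInstancePoolId: "pool-driver",
});
console.log(placement.nodeTypeId);     // worker-node-type
console.log(placement.instancePoolId); // pool-driver
```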

`Optional`enableLocalDiskEncryption

enableLocalDiskEncryption?: boolean

Whether to enable local disk encryption for the cluster.

`Optional`gcpAttributes

gcpAttributes?: PipelinesGcpAttributes

Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.

`Optional`initScripts

initScripts?: PipelinesInitScriptInfo[]

The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.

`Optional`instancePoolId

instancePoolId?: string

The optional ID of the instance pool to which the cluster belongs.

`Optional`label

label?: string

A label for the cluster specification: either `default`, to configure the default cluster, or `maintenance`, to configure the maintenance cluster. This field is optional; the default value is `default`.
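
As an illustration of how the label default might be applied when picking a specification out of a list, consider the sketch below. `LabeledClusterSketch`, `effectiveLabel`, and `findByLabel` are hypothetical names, and the node type IDs are placeholders.

```typescript
// Local stand-in with just enough fields for the example;
// not the SDK's PipelineCluster type.
interface LabeledClusterSketch {
    label?: string;
    nodeTypeId?: string;
}

// An unset label is treated as "default", per the description above.
function effectiveLabel(c: LabeledClusterSketch): string {
    return c.label ?? "default";
}

// Hypothetical helper: returns the first specification with the given label.
function findByLabel(
    clusters: LabeledClusterSketch[],
    label: string,
): LabeledClusterSketch | undefined {
    return clusters.filter((c) => effectiveLabel(c) === label)[0];
}

const clusterSpecs: LabeledClusterSketch[] = [
    { nodeTypeId: "node-type-a" },                       // label unset
    { label: "maintenance", nodeTypeId: "node-type-b" },
];
console.log(findByLabel(clusterSpecs, "default")?.nodeTypeId);     // node-type-a
console.log(findByLabel(clusterSpecs, "maintenance")?.nodeTypeId); // node-type-b
```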

`Optional`nodeTypeId

nodeTypeId?: string

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory- or compute-intensive workloads. A list of available node types can be retrieved by using the :method:clusters/listNodeTypes API call.

`Optional`policyId

policyId?: string

The ID of the cluster policy used to create the cluster if applicable.

`Optional`size

size?:
| { $case: "numWorkers"; numWorkers: number }
| { $case: "autoscale"; autoscale: PipelinesAutoScale }

Type Declaration

{ $case: "numWorkers"; numWorkers: number }
- $case: "numWorkers"
- numWorkers: number
  Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.
  
  Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.
{ $case: "autoscale"; autoscale: PipelinesAutoScale }
- $case: "autoscale"
- autoscale: PipelinesAutoScale
  Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with Databricks Runtime versions 3.0 or later.
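
Because size is a union discriminated by `$case`, exactly one of numWorkers or autoscale can be set, and checking `$case` narrows the type. The sketch below mirrors the union locally so it runs on its own. `AutoScaleSketch` is only a stand-in for PipelinesAutoScale, and its `minWorkers` and `maxWorkers` fields are assumptions not confirmed by this page; `describeSize` is a hypothetical helper.

```typescript
// Stand-in for PipelinesAutoScale; the field names are assumptions.
interface AutoScaleSketch {
    minWorkers: number;
    maxWorkers: number;
}

// Local mirror of the `size` union declared above.
type ClusterSizeSketch =
    | { $case: "numWorkers"; numWorkers: number }
    | { $case: "autoscale"; autoscale: AutoScaleSketch };

function describeSize(size: ClusterSizeSketch | undefined): string {
    if (size === undefined) {
        return "size not set";
    }
    if (size.$case === "numWorkers") {
        // One Spark driver plus numWorkers executors.
        return `fixed: ${size.numWorkers + 1} Spark nodes`;
    }
    // Only the autoscale variant remains after narrowing.
    return `autoscale: ${size.autoscale.minWorkers}-${size.autoscale.maxWorkers} workers`;
}

console.log(describeSize({ $case: "numWorkers", numWorkers: 5 }));
// fixed: 6 Spark nodes
console.log(describeSize({ $case: "autoscale", autoscale: { minWorkers: 2, maxWorkers: 8 } }));
// autoscale: 2-8 workers
```

Keeping the two options in one discriminated field, rather than as two independent optional properties, lets the compiler reject an object that tries to set both.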

`Optional`sparkConf

sparkConf?: Record<string, string>

An object containing a set of optional, user-specified Spark configuration key-value pairs. See :method:clusters/create for more details.

`Optional`sparkEnvVars

sparkEnvVars?: Record<string, string>

An object containing a set of optional, user-specified environment variable key-value pairs. Note that a key-value pair of the form (X,Y) is exported as is (i.e., export X='Y') while launching the driver and workers.

To specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default Databricks-managed environment variables are included as well.

Example Spark environment variables: {"SPARK_WORKER_MEMORY": "28000m", "SPARK_LOCAL_DIRS": "/local_disk0"} or {"SPARK_DAEMON_JAVA_OPTS": "$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true"}
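
The append recommendation can be expressed as a small helper that builds the sparkEnvVars value. This is a sketch only: `appendDaemonJavaOpts` is a hypothetical name, and the option string is taken from the example above.

```typescript
// Hypothetical helper: returns a copy of `envVars` in which the extra JVM
// options are appended to the existing $SPARK_DAEMON_JAVA_OPTS value
// instead of replacing it.
function appendDaemonJavaOpts(
    envVars: Record<string, string>,
    extraOpts: string,
): Record<string, string> {
    return {
        ...envVars,
        // "$SPARK_DAEMON_JAVA_OPTS" is kept as literal text here; it is
        // expanded by the shell when the variable is exported on the node.
        SPARK_DAEMON_JAVA_OPTS: `$SPARK_DAEMON_JAVA_OPTS ${extraOpts}`,
    };
}

const exampleEnvVars = appendDaemonJavaOpts(
    { SPARK_WORKER_MEMORY: "28000m", SPARK_LOCAL_DIRS: "/local_disk0" },
    "-Dspark.shuffle.service.enabled=true",
);
console.log(exampleEnvVars.SPARK_DAEMON_JAVA_OPTS);
// $SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true
```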

`Optional`sshPublicKeys

sshPublicKeys?: string[]

SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. Up to 10 keys can be specified.
