Interface ClusterInfo

Describes all of the metadata about a single Spark cluster in Databricks.

interface ClusterInfo {
    autoterminationMinutes?: number;
    awsAttributes?: AwsAttributes;
    azureAttributes?: AzureAttributes;
    clusterCores?: number;
    clusterId?: string;
    clusterLogConf?: ClusterLogConf;
    clusterLogStatus?: LogSyncStatus;
    clusterMemoryMb?: bigint;
    clusterName?: string;
    creatorUserName?: string;
    customTags?: Record<string, string>;
    dataSecurityMode?: DataSecurityMode;
    defaultTags?: Record<string, string>;
    dependencyMode?: DependencyMode;
    dockerImage?: DockerImage;
    driver?: SparkInfo_SparkNode;
    driverInstancePoolId?: string;
    driverNodeTypeFlexibility?: NodeTypeFlexibility;
    driverNodeTypeId?: string;
    enableElasticDisk?: boolean;
    enableLocalDiskEncryption?: boolean;
    executors?: SparkInfo_SparkNode[];
    gcpAttributes?: GcpAttributes;
    initScripts?: InitScriptInfo[];
    instancePoolId?: string;
    isSingleNode?: boolean;
    jdbcPort?: number;
    kind?: ComputeKind;
    lastRestartedTime?: bigint;
    lastStateLossTime?: bigint;
    nodeTypeId?: string;
    policyId?: string;
    remoteDiskThroughput?: number;
    runtimeEngine?: RuntimeEngine;
    singleUserName?: string;
    size?:
        | { $case: "numWorkers"; numWorkers: number }
        | { $case: "autoscale"; autoscale: AutoScale };
    sparkConf?: Record<string, string>;
    sparkContextId?: bigint;
    sparkEnvVars?: Record<string, string>;
    sparkVersion?: string;
    spec?: ClusterInfo_ComputeSpec;
    sshPublicKeys?: string[];
    startTime?: bigint;
    state?: ClusterState_ClusterState;
    stateMessage?: string;
    terminatedTime?: bigint;
    terminationReason?: TerminationReason;
    totalInitialRemoteDiskSize?: number;
    useMlRuntime?: boolean;
    workerNodeTypeFlexibility?: NodeTypeFlexibility;
    workloadType?: WorkloadType;
}

Properties

`Optional`autoterminationMinutes

autoterminationMinutes?: number

Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination.
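The rule above can be checked before a request is sent. The following is a minimal sketch; `isValidAutoterminationMinutes` is a hypothetical helper, not part of the SDK, and it only encodes the limits stated in this description.

```typescript
// Hypothetical helper (not part of the SDK): encodes the documented rule that
// the value is either unset, 0 (explicitly disabled), or within 10..10000.
function isValidAutoterminationMinutes(minutes?: number): boolean {
  if (minutes === undefined || minutes === 0) {
    return true; // unset or explicitly disabled: no automatic termination
  }
  return minutes >= 10 && minutes <= 10000;
}
```

A caller might run this check on user input and surface a validation message instead of waiting for the API to reject the request.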

`Optional`awsAttributes

awsAttributes?: AwsAttributes

Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.

`Optional`azureAttributes

azureAttributes?: AzureAttributes

Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.

`Optional`clusterCores

clusterCores?: number

Number of CPU cores available for this cluster. Note that this can be fractional, e.g. 7.5 cores, since certain node types are configured to share cores between Spark nodes on the same instance.

`Optional`clusterId

clusterId?: string

Canonical identifier for the cluster. This id is retained during cluster restarts and resizes, while each new cluster has a globally unique id.

`Optional`clusterLogConf

clusterLogConf?: ClusterLogConf

The configuration for delivering Spark logs to a long-term storage destination. Three kinds of destinations (DBFS, S3 and Unity Catalog volumes) are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 minutes. The destination of driver logs is $destination/$clusterId/driver, while the destination of executor logs is $destination/$clusterId/executor.
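As an illustration of the path layout described above, the sketch below derives both log locations from a destination root and a cluster ID. `clusterLogPaths` and its parameters are hypothetical names, not SDK functions.

```typescript
// Hypothetical helper: builds the documented driver and executor log paths.
// Trailing slashes on the destination are trimmed to avoid "//" in the result.
function clusterLogPaths(
  destination: string,
  clusterId: string
): { driver: string; executor: string } {
  const base = `${destination.replace(/\/+$/, "")}/${clusterId}`;
  return {
    driver: `${base}/driver`,
    executor: `${base}/executor`,
  };
}
```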

`Optional`clusterLogStatus

clusterLogStatus?: LogSyncStatus

Cluster log delivery status.

`Optional`clusterMemoryMb

clusterMemoryMb?: bigint

Total amount of cluster memory, in megabytes.

`Optional`clusterName

clusterName?: string

Cluster name requested by the user. This doesn't have to be unique. If not specified at creation, the cluster name will be an empty string. For job clusters, the cluster name is automatically set based on the job and job run IDs.

`Optional`creatorUserName

creatorUserName?: string

Creator user name. The field won't be included in the response if the user has already been deleted.

`Optional`customTags

customTags?: Record<string, string>

Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:

- Currently, Databricks allows at most 45 custom tags.
- Clusters can only reuse cloud resources if the resources' tags are a subset of the cluster tags.
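The two notes above can be expressed as simple checks. This is a sketch under stated assumptions: both helpers are hypothetical, and "subset" is interpreted here as every resource tag appearing in the cluster tags with the same value, which the description does not spell out.

```typescript
const MAX_CUSTOM_TAGS = 45; // limit stated in the notes above

// Hypothetical helper: true when the tag count is within the documented limit.
function withinCustomTagLimit(customTags: Record<string, string>): boolean {
  return Object.keys(customTags).length <= MAX_CUSTOM_TAGS;
}

// Hypothetical helper: true when every resource tag is also a cluster tag
// with an identical value (an assumed reading of "subset").
function isTagSubset(
  resourceTags: Record<string, string>,
  clusterTags: Record<string, string>
): boolean {
  return Object.keys(resourceTags).every(
    (key) => clusterTags[key] === resourceTags[key]
  );
}
```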

`Optional`dataSecurityMode

dataSecurityMode?: DataSecurityMode

`Optional`defaultTags

defaultTags?: Record<string, string>

Tags that are added by Databricks regardless of any custom_tags, including:

- Vendor: Databricks
- Creator: <username_of_creator>
- ClusterName: <name_of_cluster>
- ClusterId: <id_of_cluster>
- Name: <Databricks internal use>

`Optional`dependencyMode

dependencyMode?: DependencyMode

Controls dependency configuration for the cluster.

`Optional`dockerImage

dockerImage?: DockerImage

Custom Docker image (BYOC).

`Optional`driver

driver?: SparkInfo_SparkNode

Node on which the Spark driver resides. The driver node contains the Spark master and the application that manages the per-notebook Spark REPLs.

`Optional`driverInstancePoolId

driverInstancePoolId?: string

The optional ID of the instance pool to which the driver of the cluster belongs. The cluster uses the instance pool with ID instance_pool_id if the driver pool is not assigned.

`Optional`driverNodeTypeFlexibility

driverNodeTypeFlexibility?: NodeTypeFlexibility

Flexible node type configuration for the driver node.

`Optional`driverNodeTypeId

driverNodeTypeId?: string

The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set to the same value as node_type_id.

This field, along with node_type_id, should not be set if virtual_cluster_size is set. If driver_node_type_id, node_type_id, and virtual_cluster_size are all specified, driver_node_type_id and node_type_id take precedence.
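The fallback behavior described for driverNodeTypeId and driverInstancePoolId can be sketched as follows. `DriverFields` is a local stand-in for the relevant subset of ClusterInfo, and both helpers are hypothetical; the service applies these defaults itself, so this only mirrors the documented rule on the client side.

```typescript
// Local stand-in for the relevant ClusterInfo fields.
interface DriverFields {
  driverNodeTypeId?: string;
  nodeTypeId?: string;
  driverInstancePoolId?: string;
  instancePoolId?: string;
}

// Hypothetical helper: the driver node type falls back to nodeTypeId when unset.
function effectiveDriverNodeTypeId(c: DriverFields): string | undefined {
  return c.driverNodeTypeId ?? c.nodeTypeId;
}

// Hypothetical helper: the driver pool falls back to instancePoolId when unset.
function effectiveDriverPoolId(c: DriverFields): string | undefined {
  return c.driverInstancePoolId ?? c.instancePoolId;
}
```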

`Optional`enableElasticDisk

enableElasticDisk?: boolean

Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space.

`Optional`enableLocalDiskEncryption

enableLocalDiskEncryption?: boolean

Whether to enable LUKS on cluster VMs' local disks.

`Optional`executors

executors?: SparkInfo_SparkNode[]

Nodes on which the Spark executors reside.

`Optional`gcpAttributes

gcpAttributes?: GcpAttributes

Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.

`Optional`initScripts

initScripts?: InitScriptInfo[]

The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.

`Optional`instancePoolId

instancePoolId?: string

The optional ID of the instance pool to which the cluster belongs.

`Optional`isSingleNode

isSingleNode?: boolean

This field can only be used when kind = CLASSIC_PREVIEW.

When set to true, Databricks will automatically set single-node-related custom_tags, spark_conf, and num_workers.

`Optional`jdbcPort

jdbcPort?: number

Port on which the Spark JDBC server is listening, in the driver node. No service will be listening on this port in executor nodes.

`Optional`kind

kind?: ComputeKind

`Optional`lastRestartedTime

lastRestartedTime?: bigint

The timestamp at which the cluster was started/restarted.

`Optional`lastStateLossTime

lastStateLossTime?: bigint

Time when the cluster driver last lost its state (due to a restart or driver failure).

`Optional`nodeTypeId

nodeTypeId?: string

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory- or compute-intensive workloads. A list of available node types can be retrieved by using the clusters/listNodeTypes API call.

`Optional`policyId

policyId?: string

The ID of the cluster policy used to create the cluster if applicable.

`Optional`remoteDiskThroughput

remoteDiskThroughput?: number

If set, the configurable throughput (in Mb/s) of the remote disk. Currently only supported for GCP HYPERDISK_BALANCED disks.

`Optional`runtimeEngine

runtimeEngine?: RuntimeEngine

Determines the cluster's runtime engine, either standard or Photon.

This field is not compatible with legacy spark_version values that contain -photon-. Remove -photon- from the spark_version and set runtime_engine to PHOTON.

If left unspecified, the runtime engine defaults to standard unless the spark_version contains -photon-, in which case Photon will be used.
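The defaulting rule above can be sketched as a small function. The string union `Engine` is an assumption standing in for the SDK's RuntimeEngine type, whose exact members are not shown on this page, and `resolveRuntimeEngine` is a hypothetical helper.

```typescript
// Assumed stand-in for the SDK's RuntimeEngine type.
type Engine = "STANDARD" | "PHOTON";

// Hypothetical helper: an explicit engine wins; otherwise the engine is
// Photon only when the legacy spark_version contains "-photon-".
function resolveRuntimeEngine(sparkVersion: string, runtimeEngine?: Engine): Engine {
  if (runtimeEngine !== undefined) {
    return runtimeEngine;
  }
  return sparkVersion.includes("-photon-") ? "PHOTON" : "STANDARD";
}
```

The version strings used to exercise this helper are illustrative only; valid values come from the clusters/sparkVersions API call.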

`Optional`singleUserName

singleUserName?: string

Single user name if data_security_mode is SINGLE_USER.

`Optional`size

size?:
| { $case: "numWorkers"; numWorkers: number }
| { $case: "autoscale"; autoscale: AutoScale }

Type Declaration

{ $case: "numWorkers"; numWorkers: number }
- $case: "numWorkers"
- numWorkers: number
  Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.
  
  Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.
{ $case: "autoscale"; autoscale: AutoScale }
- $case: "autoscale"
- autoscale: AutoScale
  Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with Databricks Runtime versions 3.0 or later.
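
Because size is a union discriminated by $case, a switch on $case narrows it to the right branch. The sketch below is a minimal illustration: `AutoScale` is reduced to an opaque local stand-in, and `requestedSparkNodes` is a hypothetical helper that applies the num_workers + 1 rule from the description.

```typescript
// Opaque local stand-in for the SDK's AutoScale type (fields not needed here).
type AutoScale = Record<string, unknown>;

type ClusterSize =
  | { $case: "numWorkers"; numWorkers: number }
  | { $case: "autoscale"; autoscale: AutoScale };

// Hypothetical helper: returns the requested number of Spark nodes for a
// fixed-size cluster, or undefined when the count is not fixed.
function requestedSparkNodes(size?: ClusterSize): number | undefined {
  if (size === undefined) {
    return undefined;
  }
  switch (size.$case) {
    case "numWorkers":
      return size.numWorkers + 1; // one driver plus numWorkers executors
    case "autoscale":
      return undefined; // depends on load, so no single target value
  }
}
```

Note that the fixed-size result is the desired node count, which can differ from the number of nodes currently running while a resize is in progress.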

`Optional`sparkConf

sparkConf?: Record<string, string>

An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

`Optional`sparkContextId

sparkContextId?: bigint

A canonical SparkContext identifier. This value does change when the Spark driver restarts. The pair (cluster_id, spark_context_id) is a globally unique identifier over all Spark contexts.
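Since the pair (cluster_id, spark_context_id) is what identifies a Spark context, code that tracks contexts may want a single key built from both values. A minimal sketch, assuming a hypothetical `sparkContextKey` helper and an arbitrary ":" separator:

```typescript
// Hypothetical helper: combines both identifiers into one string key.
// The bigint is converted with toString() so no precision is lost.
function sparkContextKey(clusterId: string, sparkContextId: bigint): string {
  return `${clusterId}:${sparkContextId.toString()}`;
}
```

A string key is also convenient because bigint values cannot be passed to JSON.stringify directly.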

`Optional`sparkEnvVars

sparkEnvVars?: Record<string, string>

An object containing a set of optional, user-specified environment variable key-value pairs. Note that a key-value pair of the form (X,Y) will be exported as is (i.e., export X='Y') while launching the driver and workers.

To specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default Databricks-managed environment variables are included as well.

Example Spark environment variables: {"SPARK_WORKER_MEMORY": "28000m", "SPARK_LOCAL_DIRS": "/local_disk0"} or {"SPARK_DAEMON_JAVA_OPTS": "$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true"}
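
The append-rather-than-overwrite recommendation can be sketched as a helper that builds the sparkEnvVars object. `withDaemonJavaOpts` is a hypothetical name, not an SDK function; it keeps any existing value and otherwise starts from the $SPARK_DAEMON_JAVA_OPTS reference.

```typescript
// Hypothetical helper: returns a copy of envVars with extraOpts appended to
// SPARK_DAEMON_JAVA_OPTS, preserving the defaults via the $VAR reference.
function withDaemonJavaOpts(
  envVars: Record<string, string>,
  extraOpts: string
): Record<string, string> {
  const KEY = "SPARK_DAEMON_JAVA_OPTS";
  const current: string | undefined = envVars[KEY];
  const base = current !== undefined ? current : `$${KEY}`;
  return { ...envVars, [KEY]: `${base} ${extraOpts}` };
}
```

The input object is left unmodified, so the same base configuration can be reused for several clusters.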

`Optional`sparkVersion

sparkVersion?: string

The Spark version of the cluster, e.g. 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the clusters/sparkVersions API call.

`Optional`spec

spec?: ClusterInfo_ComputeSpec

The spec contains a snapshot of the latest user specified settings that were used to create/edit the cluster. Note: not included in the response of the ListClusters API.

`Optional`sshPublicKeys

sshPublicKeys?: string[]

SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. Up to 10 keys can be specified.

`Optional`startTime

startTime?: bigint

Time (in epoch milliseconds) when the cluster creation request was received (when the cluster entered a PENDING state).

`Optional`state

state?: ClusterState_ClusterState

Current state of the cluster.

`Optional`stateMessage

stateMessage?: string

A message associated with the most recent state transition (e.g., the reason why the cluster entered a TERMINATED state).

`Optional`terminatedTime

terminatedTime?: bigint

Time (in epoch milliseconds) when the cluster was terminated, if applicable.
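The time fields on this interface are bigint values in epoch milliseconds, so they need a conversion before use with Date. The sketch below shows one way to do that; `ClusterTimes`, `toDate`, and `lifetimeMs` are hypothetical names introduced for illustration.

```typescript
// Local stand-in for the relevant ClusterInfo fields.
interface ClusterTimes {
  startTime?: bigint;
  terminatedTime?: bigint;
}

// Converts epoch milliseconds to a Date. Number() is safe here because
// millisecond timestamps are far below Number.MAX_SAFE_INTEGER.
function toDate(epochMs: bigint): Date {
  return new Date(Number(epochMs));
}

// Returns the elapsed milliseconds between creation request and termination,
// or undefined when either timestamp is missing.
function lifetimeMs(times: ClusterTimes): bigint | undefined {
  if (times.startTime === undefined || times.terminatedTime === undefined) {
    return undefined;
  }
  return times.terminatedTime - times.startTime;
}
```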

`Optional`terminationReason

terminationReason?: TerminationReason

Information about why the cluster was terminated. This field only appears when the cluster is in a TERMINATING or TERMINATED state.

`Optional`totalInitialRemoteDiskSize

totalInitialRemoteDiskSize?: number

If set, the total initial volume size (in GB) of the remote disks. Currently only supported for GCP HYPERDISK_BALANCED disks.

`Optional`useMlRuntime

useMlRuntime?: boolean

This field can only be used when kind = CLASSIC_PREVIEW.

effective_spark_version is determined by spark_version (DBR release), this field (use_ml_runtime), and whether node_type_id is a GPU node.

`Optional`workerNodeTypeFlexibility

workerNodeTypeFlexibility?: NodeTypeFlexibility

Flexible node type configuration for worker nodes.

`Optional`workloadType

workloadType?: WorkloadType
