Databricks SDK for JavaScript

    Variable TerminationCodeConst

    TerminationCode: {
        ABUSE_DETECTED: "ABUSE_DETECTED";
        ACCESS_TOKEN_FAILURE: "ACCESS_TOKEN_FAILURE";
        ALLOCATION_TIMEOUT: "ALLOCATION_TIMEOUT";
        ALLOCATION_TIMEOUT_NO_HEALTHY_AND_WARMED_UP_CLUSTERS: "ALLOCATION_TIMEOUT_NO_HEALTHY_AND_WARMED_UP_CLUSTERS";
        ALLOCATION_TIMEOUT_NO_HEALTHY_CLUSTERS: "ALLOCATION_TIMEOUT_NO_HEALTHY_CLUSTERS";
        ALLOCATION_TIMEOUT_NO_MATCHED_CLUSTERS: "ALLOCATION_TIMEOUT_NO_MATCHED_CLUSTERS";
        ALLOCATION_TIMEOUT_NO_READY_CLUSTERS: "ALLOCATION_TIMEOUT_NO_READY_CLUSTERS";
        ALLOCATION_TIMEOUT_NO_UNALLOCATED_CLUSTERS: "ALLOCATION_TIMEOUT_NO_UNALLOCATED_CLUSTERS";
        ALLOCATION_TIMEOUT_NO_WARMED_UP_CLUSTERS: "ALLOCATION_TIMEOUT_NO_WARMED_UP_CLUSTERS";
        ALLOCATION_TIMEOUT_NODE_DAEMON_NOT_READY: "ALLOCATION_TIMEOUT_NODE_DAEMON_NOT_READY";
        ATTACH_PROJECT_FAILURE: "ATTACH_PROJECT_FAILURE";
        AWS_AUTHORIZATION_FAILURE: "AWS_AUTHORIZATION_FAILURE";
        AWS_INACCESSIBLE_KMS_KEY_FAILURE: "AWS_INACCESSIBLE_KMS_KEY_FAILURE";
        AWS_INSTANCE_PROFILE_UPDATE_FAILURE: "AWS_INSTANCE_PROFILE_UPDATE_FAILURE";
        AWS_INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET_FAILURE: "AWS_INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET_FAILURE";
        AWS_INSUFFICIENT_INSTANCE_CAPACITY_FAILURE: "AWS_INSUFFICIENT_INSTANCE_CAPACITY_FAILURE";
        AWS_INVALID_KEY_PAIR: "AWS_INVALID_KEY_PAIR";
        AWS_INVALID_KMS_KEY_STATE: "AWS_INVALID_KMS_KEY_STATE";
        AWS_MAX_SPOT_INSTANCE_COUNT_EXCEEDED_FAILURE: "AWS_MAX_SPOT_INSTANCE_COUNT_EXCEEDED_FAILURE";
        AWS_REQUEST_LIMIT_EXCEEDED: "AWS_REQUEST_LIMIT_EXCEEDED";
        AWS_RESOURCE_QUOTA_EXCEEDED: "AWS_RESOURCE_QUOTA_EXCEEDED";
        AWS_UNSUPPORTED_FAILURE: "AWS_UNSUPPORTED_FAILURE";
        AZURE_BYOK_KEY_PERMISSION_FAILURE: "AZURE_BYOK_KEY_PERMISSION_FAILURE";
        AZURE_EPHEMERAL_DISK_FAILURE: "AZURE_EPHEMERAL_DISK_FAILURE";
        AZURE_INVALID_DEPLOYMENT_TEMPLATE: "AZURE_INVALID_DEPLOYMENT_TEMPLATE";
        AZURE_OPERATION_NOT_ALLOWED_EXCEPTION: "AZURE_OPERATION_NOT_ALLOWED_EXCEPTION";
        AZURE_PACKED_DEPLOYMENT_PARTIAL_FAILURE: "AZURE_PACKED_DEPLOYMENT_PARTIAL_FAILURE";
        AZURE_QUOTA_EXCEEDED_EXCEPTION: "AZURE_QUOTA_EXCEEDED_EXCEPTION";
        AZURE_RESOURCE_MANAGER_THROTTLING: "AZURE_RESOURCE_MANAGER_THROTTLING";
        AZURE_RESOURCE_PROVIDER_THROTTLING: "AZURE_RESOURCE_PROVIDER_THROTTLING";
        AZURE_UNEXPECTED_DEPLOYMENT_TEMPLATE_FAILURE: "AZURE_UNEXPECTED_DEPLOYMENT_TEMPLATE_FAILURE";
        AZURE_VM_EXTENSION_FAILURE: "AZURE_VM_EXTENSION_FAILURE";
        AZURE_VNET_CONFIGURATION_FAILURE: "AZURE_VNET_CONFIGURATION_FAILURE";
        BOOTSTRAP_TIMEOUT: "BOOTSTRAP_TIMEOUT";
        BOOTSTRAP_TIMEOUT_CLOUD_PROVIDER_EXCEPTION: "BOOTSTRAP_TIMEOUT_CLOUD_PROVIDER_EXCEPTION";
        BOOTSTRAP_TIMEOUT_DUE_TO_MISCONFIG: "BOOTSTRAP_TIMEOUT_DUE_TO_MISCONFIG";
        BUDGET_POLICY_LIMIT_ENFORCEMENT_ACTIVATED: "BUDGET_POLICY_LIMIT_ENFORCEMENT_ACTIVATED";
        BUDGET_POLICY_RESOLUTION_FAILURE: "BUDGET_POLICY_RESOLUTION_FAILURE";
        CLOUD_ACCOUNT_POD_QUOTA_EXCEEDED: "CLOUD_ACCOUNT_POD_QUOTA_EXCEEDED";
        CLOUD_ACCOUNT_SETUP_FAILURE: "CLOUD_ACCOUNT_SETUP_FAILURE";
        CLOUD_OPERATION_CANCELLED: "CLOUD_OPERATION_CANCELLED";
        CLOUD_PROVIDER_DISK_SETUP_FAILURE: "CLOUD_PROVIDER_DISK_SETUP_FAILURE";
        CLOUD_PROVIDER_INSTANCE_NOT_LAUNCHED: "CLOUD_PROVIDER_INSTANCE_NOT_LAUNCHED";
        CLOUD_PROVIDER_LAUNCH_FAILURE: "CLOUD_PROVIDER_LAUNCH_FAILURE";
        CLOUD_PROVIDER_LAUNCH_FAILURE_DUE_TO_MISCONFIG: "CLOUD_PROVIDER_LAUNCH_FAILURE_DUE_TO_MISCONFIG";
        CLOUD_PROVIDER_RESOURCE_STOCKOUT: "CLOUD_PROVIDER_RESOURCE_STOCKOUT";
        CLOUD_PROVIDER_RESOURCE_STOCKOUT_DUE_TO_MISCONFIG: "CLOUD_PROVIDER_RESOURCE_STOCKOUT_DUE_TO_MISCONFIG";
        CLOUD_PROVIDER_SHUTDOWN: "CLOUD_PROVIDER_SHUTDOWN";
        CLUSTER_OPERATION_THROTTLED: "CLUSTER_OPERATION_THROTTLED";
        CLUSTER_OPERATION_TIMEOUT: "CLUSTER_OPERATION_TIMEOUT";
        COMMUNICATION_LOST: "COMMUNICATION_LOST";
        CONTAINER_LAUNCH_FAILURE: "CONTAINER_LAUNCH_FAILURE";
        CONTROL_PLANE_CONNECTION_FAILURE: "CONTROL_PLANE_CONNECTION_FAILURE";
        CONTROL_PLANE_CONNECTION_FAILURE_DUE_TO_MISCONFIG: "CONTROL_PLANE_CONNECTION_FAILURE_DUE_TO_MISCONFIG";
        CONTROL_PLANE_REQUEST_FAILURE: "CONTROL_PLANE_REQUEST_FAILURE";
        CONTROL_PLANE_REQUEST_FAILURE_DUE_TO_MISCONFIG: "CONTROL_PLANE_REQUEST_FAILURE_DUE_TO_MISCONFIG";
        DATA_ACCESS_CONFIG_CHANGED: "DATA_ACCESS_CONFIG_CHANGED";
        DATABASE_CONNECTION_FAILURE: "DATABASE_CONNECTION_FAILURE";
        DBFS_COMPONENT_UNHEALTHY: "DBFS_COMPONENT_UNHEALTHY";
        DBR_IMAGE_RESOLUTION_FAILURE: "DBR_IMAGE_RESOLUTION_FAILURE";
        DISASTER_RECOVERY_REPLICATION: "DISASTER_RECOVERY_REPLICATION";
        DNS_RESOLUTION_ERROR: "DNS_RESOLUTION_ERROR";
        DOCKER_CONTAINER_CREATION_EXCEPTION: "DOCKER_CONTAINER_CREATION_EXCEPTION";
        DOCKER_IMAGE_PULL_FAILURE: "DOCKER_IMAGE_PULL_FAILURE";
        DOCKER_IMAGE_TOO_LARGE_FOR_INSTANCE_EXCEPTION: "DOCKER_IMAGE_TOO_LARGE_FOR_INSTANCE_EXCEPTION";
        DOCKER_INVALID_OS_EXCEPTION: "DOCKER_INVALID_OS_EXCEPTION";
        DRIVER_EVICTION: "DRIVER_EVICTION";
        DRIVER_LAUNCH_TIMEOUT: "DRIVER_LAUNCH_TIMEOUT";
        DRIVER_NODE_UNREACHABLE: "DRIVER_NODE_UNREACHABLE";
        DRIVER_OUT_OF_DISK: "DRIVER_OUT_OF_DISK";
        DRIVER_OUT_OF_MEMORY: "DRIVER_OUT_OF_MEMORY";
        DRIVER_POD_CREATION_FAILURE: "DRIVER_POD_CREATION_FAILURE";
        DRIVER_UNEXPECTED_FAILURE: "DRIVER_UNEXPECTED_FAILURE";
        DRIVER_UNHEALTHY: "DRIVER_UNHEALTHY";
        DRIVER_UNREACHABLE: "DRIVER_UNREACHABLE";
        DRIVER_UNRESPONSIVE: "DRIVER_UNRESPONSIVE";
        DYNAMIC_SPARK_CONF_SIZE_EXCEEDED: "DYNAMIC_SPARK_CONF_SIZE_EXCEEDED";
        EOS_SPARK_IMAGE: "EOS_SPARK_IMAGE";
        EXECUTION_COMPONENT_UNHEALTHY: "EXECUTION_COMPONENT_UNHEALTHY";
        EXECUTOR_POD_UNSCHEDULED: "EXECUTOR_POD_UNSCHEDULED";
        GCP_API_RATE_QUOTA_EXCEEDED: "GCP_API_RATE_QUOTA_EXCEEDED";
        GCP_DENIED_BY_ORG_POLICY: "GCP_DENIED_BY_ORG_POLICY";
        GCP_FORBIDDEN: "GCP_FORBIDDEN";
        GCP_IAM_TIMEOUT: "GCP_IAM_TIMEOUT";
        GCP_INACCESSIBLE_KMS_KEY_FAILURE: "GCP_INACCESSIBLE_KMS_KEY_FAILURE";
        GCP_INSUFFICIENT_CAPACITY: "GCP_INSUFFICIENT_CAPACITY";
        GCP_IP_SPACE_EXHAUSTED: "GCP_IP_SPACE_EXHAUSTED";
        GCP_KMS_KEY_PERMISSION_DENIED: "GCP_KMS_KEY_PERMISSION_DENIED";
        GCP_NOT_FOUND: "GCP_NOT_FOUND";
        GCP_QUOTA_EXCEEDED: "GCP_QUOTA_EXCEEDED";
        GCP_RESOURCE_QUOTA_EXCEEDED: "GCP_RESOURCE_QUOTA_EXCEEDED";
        GCP_SERVICE_ACCOUNT_ACCESS_DENIED: "GCP_SERVICE_ACCOUNT_ACCESS_DENIED";
        GCP_SERVICE_ACCOUNT_DELETED: "GCP_SERVICE_ACCOUNT_DELETED";
        GCP_SERVICE_ACCOUNT_NOT_FOUND: "GCP_SERVICE_ACCOUNT_NOT_FOUND";
        GCP_SUBNET_NOT_READY: "GCP_SUBNET_NOT_READY";
        GCP_TRUSTED_IMAGE_PROJECTS_VIOLATED: "GCP_TRUSTED_IMAGE_PROJECTS_VIOLATED";
        GKE_BASED_CLUSTER_TERMINATION: "GKE_BASED_CLUSTER_TERMINATION";
        GLOBAL_INIT_SCRIPT_FAILURE: "GLOBAL_INIT_SCRIPT_FAILURE";
        HIVE_METASTORE_PROVISIONING_FAILURE: "HIVE_METASTORE_PROVISIONING_FAILURE";
        HIVEMETASTORE_CONNECTIVITY_FAILURE: "HIVEMETASTORE_CONNECTIVITY_FAILURE";
        IMAGE_PULL_PERMISSION_DENIED: "IMAGE_PULL_PERMISSION_DENIED";
        IN_PENALTY_BOX: "IN_PENALTY_BOX";
        INACTIVITY: "INACTIVITY";
        INIT_CONTAINER_NOT_FINISHED: "INIT_CONTAINER_NOT_FINISHED";
        INIT_SCRIPT_FAILURE: "INIT_SCRIPT_FAILURE";
        INSTANCE_POOL_CLUSTER_FAILURE: "INSTANCE_POOL_CLUSTER_FAILURE";
        INSTANCE_POOL_MAX_CAPACITY_REACHED: "INSTANCE_POOL_MAX_CAPACITY_REACHED";
        INSTANCE_POOL_NOT_FOUND: "INSTANCE_POOL_NOT_FOUND";
        INSTANCE_UNREACHABLE: "INSTANCE_UNREACHABLE";
        INSTANCE_UNREACHABLE_DUE_TO_MISCONFIG: "INSTANCE_UNREACHABLE_DUE_TO_MISCONFIG";
        INTERNAL_CAPACITY_FAILURE: "INTERNAL_CAPACITY_FAILURE";
        INTERNAL_ERROR: "INTERNAL_ERROR";
        INVALID_ARGUMENT: "INVALID_ARGUMENT";
        INVALID_AWS_PARAMETER: "INVALID_AWS_PARAMETER";
        INVALID_INSTANCE_PLACEMENT_PROTOCOL: "INVALID_INSTANCE_PLACEMENT_PROTOCOL";
        INVALID_SPARK_IMAGE: "INVALID_SPARK_IMAGE";
        INVALID_WORKER_IMAGE_FAILURE: "INVALID_WORKER_IMAGE_FAILURE";
        IP_EXHAUSTION_FAILURE: "IP_EXHAUSTION_FAILURE";
        JOB_FINISHED: "JOB_FINISHED";
        K8S_ACTIVE_POD_QUOTA_EXCEEDED: "K8S_ACTIVE_POD_QUOTA_EXCEEDED";
        K8S_AUTOSCALING_FAILURE: "K8S_AUTOSCALING_FAILURE";
        K8S_DBR_CLUSTER_LAUNCH_TIMEOUT: "K8S_DBR_CLUSTER_LAUNCH_TIMEOUT";
        LAZY_ALLOCATION_TIMEOUT: "LAZY_ALLOCATION_TIMEOUT";
        MAINTENANCE_MODE: "MAINTENANCE_MODE";
        METASTORE_COMPONENT_UNHEALTHY: "METASTORE_COMPONENT_UNHEALTHY";
        MTLS_PORT_CONNECTIVITY_FAILURE: "MTLS_PORT_CONNECTIVITY_FAILURE";
        NEPHOS_RESOURCE_MANAGEMENT: "NEPHOS_RESOURCE_MANAGEMENT";
        NETVISOR_SETUP_TIMEOUT: "NETVISOR_SETUP_TIMEOUT";
        NETWORK_CHECK_CONTROL_PLANE_FAILURE: "NETWORK_CHECK_CONTROL_PLANE_FAILURE";
        NETWORK_CHECK_CONTROL_PLANE_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_CONTROL_PLANE_FAILURE_DUE_TO_MISCONFIG";
        NETWORK_CHECK_DNS_SERVER_FAILURE: "NETWORK_CHECK_DNS_SERVER_FAILURE";
        NETWORK_CHECK_DNS_SERVER_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_DNS_SERVER_FAILURE_DUE_TO_MISCONFIG";
        NETWORK_CHECK_METADATA_ENDPOINT_FAILURE: "NETWORK_CHECK_METADATA_ENDPOINT_FAILURE";
        NETWORK_CHECK_METADATA_ENDPOINT_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_METADATA_ENDPOINT_FAILURE_DUE_TO_MISCONFIG";
        NETWORK_CHECK_MULTIPLE_COMPONENTS_FAILURE: "NETWORK_CHECK_MULTIPLE_COMPONENTS_FAILURE";
        NETWORK_CHECK_MULTIPLE_COMPONENTS_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_MULTIPLE_COMPONENTS_FAILURE_DUE_TO_MISCONFIG";
        NETWORK_CHECK_NIC_FAILURE: "NETWORK_CHECK_NIC_FAILURE";
        NETWORK_CHECK_NIC_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_NIC_FAILURE_DUE_TO_MISCONFIG";
        NETWORK_CHECK_STORAGE_FAILURE: "NETWORK_CHECK_STORAGE_FAILURE";
        NETWORK_CHECK_STORAGE_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_STORAGE_FAILURE_DUE_TO_MISCONFIG";
        NETWORK_CONFIGURATION_FAILURE: "NETWORK_CONFIGURATION_FAILURE";
        NFS_MOUNT_FAILURE: "NFS_MOUNT_FAILURE";
        NO_MATCHED_K8S: "NO_MATCHED_K8S";
        NO_MATCHED_K8S_TESTING_TAG: "NO_MATCHED_K8S_TESTING_TAG";
        NPIP_TUNNEL_SETUP_FAILURE: "NPIP_TUNNEL_SETUP_FAILURE";
        NPIP_TUNNEL_TOKEN_FAILURE: "NPIP_TUNNEL_TOKEN_FAILURE";
        POD_ASSIGNMENT_FAILURE: "POD_ASSIGNMENT_FAILURE";
        POD_SCHEDULING_FAILURE: "POD_SCHEDULING_FAILURE";
        RATE_LIMITED: "RATE_LIMITED";
        REQUEST_REJECTED: "REQUEST_REJECTED";
        REQUEST_THROTTLED: "REQUEST_THROTTLED";
        RESOURCE_USAGE_BLOCKED: "RESOURCE_USAGE_BLOCKED";
        SECRET_CREATION_FAILURE: "SECRET_CREATION_FAILURE";
        SECRET_PERMISSION_DENIED: "SECRET_PERMISSION_DENIED";
        SECRET_RESOLUTION_ERROR: "SECRET_RESOLUTION_ERROR";
        SECURITY_DAEMON_REGISTRATION_EXCEPTION: "SECURITY_DAEMON_REGISTRATION_EXCEPTION";
        SELF_BOOTSTRAP_FAILURE: "SELF_BOOTSTRAP_FAILURE";
        SERVERLESS_LONG_RUNNING_TERMINATED: "SERVERLESS_LONG_RUNNING_TERMINATED";
        SKIPPED_SLOW_NODES: "SKIPPED_SLOW_NODES";
        SLOW_IMAGE_DOWNLOAD: "SLOW_IMAGE_DOWNLOAD";
        SPARK_ERROR: "SPARK_ERROR";
        SPARK_IMAGE_DOWNLOAD_FAILURE: "SPARK_IMAGE_DOWNLOAD_FAILURE";
        SPARK_IMAGE_DOWNLOAD_THROTTLED: "SPARK_IMAGE_DOWNLOAD_THROTTLED";
        SPARK_IMAGE_NOT_FOUND: "SPARK_IMAGE_NOT_FOUND";
        SPARK_STARTUP_FAILURE: "SPARK_STARTUP_FAILURE";
        SPOT_INSTANCE_TERMINATION: "SPOT_INSTANCE_TERMINATION";
        SSH_BOOTSTRAP_FAILURE: "SSH_BOOTSTRAP_FAILURE";
        STORAGE_DOWNLOAD_FAILURE: "STORAGE_DOWNLOAD_FAILURE";
        STORAGE_DOWNLOAD_FAILURE_DUE_TO_MISCONFIG: "STORAGE_DOWNLOAD_FAILURE_DUE_TO_MISCONFIG";
        STORAGE_DOWNLOAD_FAILURE_SLOW: "STORAGE_DOWNLOAD_FAILURE_SLOW";
        STORAGE_DOWNLOAD_FAILURE_THROTTLED: "STORAGE_DOWNLOAD_FAILURE_THROTTLED";
        STS_CLIENT_SETUP_FAILURE: "STS_CLIENT_SETUP_FAILURE";
        SUBNET_EXHAUSTED_FAILURE: "SUBNET_EXHAUSTED_FAILURE";
        TEMPORARILY_UNAVAILABLE: "TEMPORARILY_UNAVAILABLE";
        TRIAL_EXPIRED: "TRIAL_EXPIRED";
        UNEXPECTED_LAUNCH_FAILURE: "UNEXPECTED_LAUNCH_FAILURE";
        UNEXPECTED_POD_RECREATION: "UNEXPECTED_POD_RECREATION";
        UNKNOWN: "UNKNOWN";
        UNSUPPORTED_INSTANCE_TYPE: "UNSUPPORTED_INSTANCE_TYPE";
        UPDATE_INSTANCE_PROFILE_FAILURE: "UPDATE_INSTANCE_PROFILE_FAILURE";
        USAGE_POLICY_ENTITLEMENT_DENIED: "USAGE_POLICY_ENTITLEMENT_DENIED";
        USER_INITIATED_VM_TERMINATION: "USER_INITIATED_VM_TERMINATION";
        USER_REQUEST: "USER_REQUEST";
        WORKER_SETUP_FAILURE: "WORKER_SETUP_FAILURE";
        WORKSPACE_CANCELLED_ERROR: "WORKSPACE_CANCELLED_ERROR";
        WORKSPACE_CONFIGURATION_ERROR: "WORKSPACE_CONFIGURATION_ERROR";
        WORKSPACE_UPDATE: "WORKSPACE_UPDATE";
    } = ...

    The status code indicating why the cluster was terminated
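
      Because every value in the declaration above equals its key, a string-literal union and a runtime guard can be derived from the object. The sketch below is a minimal illustration, not the SDK's own code: it uses a local three-entry stand-in (`TerminationCodeSubset`, a hypothetical name) instead of importing from the SDK, since the package's import path is not shown on this page. `TerminationCodeValue` and `isTerminationCode` are likewise illustrative names.

```typescript
// Local stand-in for a few entries of the SDK constant (assumption: the real
// object has the same key-equals-value shape, with many more keys).
const TerminationCodeSubset = {
  INACTIVITY: "INACTIVITY",
  USER_REQUEST: "USER_REQUEST",
  UNKNOWN: "UNKNOWN",
} as const;

// Union of the literal values: "INACTIVITY" | "USER_REQUEST" | "UNKNOWN".
type TerminationCodeValue =
  (typeof TerminationCodeSubset)[keyof typeof TerminationCodeSubset];

// Runtime guard for strings of unknown origin. Because each key equals its
// value, an own-property lookup on the object is sufficient.
function isTerminationCode(value: string): value is TerminationCodeValue {
  return Object.prototype.hasOwnProperty.call(TerminationCodeSubset, value);
}
```

      A raw string can then be narrowed with `isTerminationCode(raw)` before it is compared against specific codes.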

    Type Declaration

    • ReadonlyABUSE_DETECTED: "ABUSE_DETECTED"

      The cluster was terminated because we detected an abusive runtime behavior that violated Terms of Service or Acceptable Use Policy.

    • ReadonlyACCESS_TOKEN_FAILURE: "ACCESS_TOKEN_FAILURE"

      Failed to fetch internal PAT token required for init script installation from WSFS/UC volumes

    • ReadonlyALLOCATION_TIMEOUT: "ALLOCATION_TIMEOUT"

      Lazy allocation timeout with unknown reason.

    • ReadonlyALLOCATION_TIMEOUT_NO_HEALTHY_AND_WARMED_UP_CLUSTERS: "ALLOCATION_TIMEOUT_NO_HEALTHY_AND_WARMED_UP_CLUSTERS"

      Lazy allocation timeout. Maps to NoCandidatesHealthyAndWarmedUp.

    • ReadonlyALLOCATION_TIMEOUT_NO_HEALTHY_CLUSTERS: "ALLOCATION_TIMEOUT_NO_HEALTHY_CLUSTERS"

      Lazy allocation timeout. Maps to NoCandidatesHealthy.

    • ReadonlyALLOCATION_TIMEOUT_NO_MATCHED_CLUSTERS: "ALLOCATION_TIMEOUT_NO_MATCHED_CLUSTERS"

      Lazy allocation timeout. Maps to NoMatchedUnallocatedDbrCluster.

    • ReadonlyALLOCATION_TIMEOUT_NO_READY_CLUSTERS: "ALLOCATION_TIMEOUT_NO_READY_CLUSTERS"

      Lazy allocation timeout. Maps to NoUnallocatedReadyDbrCluster.

    • ReadonlyALLOCATION_TIMEOUT_NO_UNALLOCATED_CLUSTERS: "ALLOCATION_TIMEOUT_NO_UNALLOCATED_CLUSTERS"

      Lazy allocation timeout. Maps to NoUnallocatedDbrCluster.

    • ReadonlyALLOCATION_TIMEOUT_NO_WARMED_UP_CLUSTERS: "ALLOCATION_TIMEOUT_NO_WARMED_UP_CLUSTERS"

      Lazy allocation timeout. Maps to NoMatchedUnallocatedWarmedUpDbrCluster.

    • ReadonlyALLOCATION_TIMEOUT_NODE_DAEMON_NOT_READY: "ALLOCATION_TIMEOUT_NODE_DAEMON_NOT_READY"

      Lazy allocation timeout. Maps to NoCandidatesWithNodeDaemonK8sReady.
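
      The allocation-timeout codes above each map to an internal reason name. A lookup table such as the following could surface that name in logs. This is a sketch only: the reason strings are copied from the descriptions above, while `allocationTimeoutReasons` and `describeAllocationTimeout` are hypothetical names that are not part of the SDK.

```typescript
// Internal reason names, copied from the descriptions above.
const allocationTimeoutReasons: Record<string, string> = {
  ALLOCATION_TIMEOUT_NO_HEALTHY_AND_WARMED_UP_CLUSTERS: "NoCandidatesHealthyAndWarmedUp",
  ALLOCATION_TIMEOUT_NO_HEALTHY_CLUSTERS: "NoCandidatesHealthy",
  ALLOCATION_TIMEOUT_NO_MATCHED_CLUSTERS: "NoMatchedUnallocatedDbrCluster",
  ALLOCATION_TIMEOUT_NO_READY_CLUSTERS: "NoUnallocatedReadyDbrCluster",
  ALLOCATION_TIMEOUT_NO_UNALLOCATED_CLUSTERS: "NoUnallocatedDbrCluster",
  ALLOCATION_TIMEOUT_NO_WARMED_UP_CLUSTERS: "NoMatchedUnallocatedWarmedUpDbrCluster",
  ALLOCATION_TIMEOUT_NODE_DAEMON_NOT_READY: "NoCandidatesWithNodeDaemonK8sReady",
};

// Returns the internal reason for an allocation-timeout code, "unknown reason"
// for the bare ALLOCATION_TIMEOUT code, and undefined for any other code.
function describeAllocationTimeout(code: string): string | undefined {
  if (code === "ALLOCATION_TIMEOUT") {
    return "unknown reason";
  }
  return Object.prototype.hasOwnProperty.call(allocationTimeoutReasons, code)
    ? allocationTimeoutReasons[code]
    : undefined;
}
```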

    • ReadonlyATTACH_PROJECT_FAILURE: "ATTACH_PROJECT_FAILURE"

      Attach projects failure

    • ReadonlyAWS_AUTHORIZATION_FAILURE: "AWS_AUTHORIZATION_FAILURE"

      Lacking authorization for the cluster operation. For example, awsApiErrorCode: 'AccessDenied' or 'UnauthorizedOperation'.

    • ReadonlyAWS_INACCESSIBLE_KMS_KEY_FAILURE: "AWS_INACCESSIBLE_KMS_KEY_FAILURE"

      Failed during instance bootstrap with error code Cannot convert NVMe-based dev id

    • ReadonlyAWS_INSTANCE_PROFILE_UPDATE_FAILURE: "AWS_INSTANCE_PROFILE_UPDATE_FAILURE"

      Failure to update the instance profile for the cluster.

    • ReadonlyAWS_INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET_FAILURE: "AWS_INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET_FAILURE"

      We don't have enough addresses in the subnet for the instances in the request.

    • ReadonlyAWS_INSUFFICIENT_INSTANCE_CAPACITY_FAILURE: "AWS_INSUFFICIENT_INSTANCE_CAPACITY_FAILURE"

      Could not find enough of the requested instance type in the requested AZ. Often related to Auto AZ.

    • ReadonlyAWS_INVALID_KEY_PAIR: "AWS_INVALID_KEY_PAIR"

      The specified key pair name does not exist.

    • ReadonlyAWS_INVALID_KMS_KEY_STATE: "AWS_INVALID_KMS_KEY_STATE"

      The KMS key provided is in an incorrect state.

    • ReadonlyAWS_MAX_SPOT_INSTANCE_COUNT_EXCEEDED_FAILURE: "AWS_MAX_SPOT_INSTANCE_COUNT_EXCEEDED_FAILURE"

      The spot instance count in an account has exceeded the limit

    • ReadonlyAWS_REQUEST_LIMIT_EXCEEDED: "AWS_REQUEST_LIMIT_EXCEEDED"

      The maximum request rate permitted by the Amazon EC2 APIs has been exceeded for your account.

    • ReadonlyAWS_RESOURCE_QUOTA_EXCEEDED: "AWS_RESOURCE_QUOTA_EXCEEDED"

      Could not find enough AWS resources to fulfill the request

    • ReadonlyAWS_UNSUPPORTED_FAILURE: "AWS_UNSUPPORTED_FAILURE"

      The request is not supported (This is a vague error code that can be thrown for a lot of reasons.)

    • ReadonlyAZURE_BYOK_KEY_PERMISSION_FAILURE: "AZURE_BYOK_KEY_PERMISSION_FAILURE"

      Legitimate cluster termination in Azure caused by the customer revoking the key permission used for managed-disk encryption.

    • ReadonlyAZURE_EPHEMERAL_DISK_FAILURE: "AZURE_EPHEMERAL_DISK_FAILURE"

      Termination because of an unsupported Azure ephemeral OS disk setup.

    • ReadonlyAZURE_INVALID_DEPLOYMENT_TEMPLATE: "AZURE_INVALID_DEPLOYMENT_TEMPLATE"

      Occurs when the deployment template we submit to Azure violates Azure's requirements. Typical scenarios:

      • A wrong parameter key or value is used
      • The limit for a certain parameter is exceeded

    • ReadonlyAZURE_OPERATION_NOT_ALLOWED_EXCEPTION: "AZURE_OPERATION_NOT_ALLOWED_EXCEPTION"

      NOTE: This is currently used by exceptions with messages that are classified as user errors.

    • ReadonlyAZURE_PACKED_DEPLOYMENT_PARTIAL_FAILURE: "AZURE_PACKED_DEPLOYMENT_PARTIAL_FAILURE"

      This error code is used when the cluster is terminated because its instances fail with a partial failure from Azure packed deployments. In Azure, we might pack multiple launch requests into one deployment template in order to avoid the 800-template limit on the Azure side. If the packed deployment fails multiple times, the cluster could be terminated with the AZURE_PACKED_DEPLOYMENT_PARTIAL_FAILURE termination code.

    • ReadonlyAZURE_QUOTA_EXCEEDED_EXCEPTION: "AZURE_QUOTA_EXCEEDED_EXCEPTION"

      Could not find enough Azure resources to fulfill the request.

    • ReadonlyAZURE_RESOURCE_MANAGER_THROTTLING: "AZURE_RESOURCE_MANAGER_THROTTLING"

      Databricks may hit the Azure Resource Manager request limit, which prevents the Azure SDK from issuing any read or write request to Azure Resource Manager. The request limit is applied to each subscription every hour, so retrying after an hour or switching to a smaller cluster size might help resolve the issue. For more information, see: https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits

    • ReadonlyAZURE_RESOURCE_PROVIDER_THROTTLING: "AZURE_RESOURCE_PROVIDER_THROTTLING"

      Databricks may hit the Azure resource provider request limit. Specifically, the API request rate to the specific resource type (Compute, Network, etc.) can't exceed the limit. Retrying might help resolve the issue. For more information, see: https://docs.microsoft.com/en-us/azure/virtual-machines/troubleshooting/troubleshooting-throttling-errors

    • ReadonlyAZURE_UNEXPECTED_DEPLOYMENT_TEMPLATE_FAILURE: "AZURE_UNEXPECTED_DEPLOYMENT_TEMPLATE_FAILURE"

      The set of uncategorized failure responses from Azure when we launch instance resources using a deployment template.

    • ReadonlyAZURE_VM_EXTENSION_FAILURE: "AZURE_VM_EXTENSION_FAILURE"

      Azure VM Extension failure during instance bootstrap

    • ReadonlyAZURE_VNET_CONFIGURATION_FAILURE: "AZURE_VNET_CONFIGURATION_FAILURE"

      Failures during Azure VNet configuration. For example, a workspace with VNet injection had incorrect DNS settings that blocked access to worker artifacts.

    • ReadonlyBOOTSTRAP_TIMEOUT: "BOOTSTRAP_TIMEOUT"

      Timed out while pinging the nodeDaemon. Possible reasons: the nodeDaemon didn't start (configuration issue) or there is a network connectivity issue.

    • ReadonlyBOOTSTRAP_TIMEOUT_CLOUD_PROVIDER_EXCEPTION: "BOOTSTRAP_TIMEOUT_CLOUD_PROVIDER_EXCEPTION"

      Bootstrap timeout due to Azure Extension Service Failure

    • ReadonlyBOOTSTRAP_TIMEOUT_DUE_TO_MISCONFIG: "BOOTSTRAP_TIMEOUT_DUE_TO_MISCONFIG"

      A bootstrap timeout that was caused by misconfiguration on the customer's side

    • ReadonlyBUDGET_POLICY_LIMIT_ENFORCEMENT_ACTIVATED: "BUDGET_POLICY_LIMIT_ENFORCEMENT_ACTIVATED"

      The cluster can be terminated when budget policy limit enforcement is activated.

    • ReadonlyBUDGET_POLICY_RESOLUTION_FAILURE: "BUDGET_POLICY_RESOLUTION_FAILURE"

      The cluster was terminated as it failed to resolve budget policy.

    • ReadonlyCLOUD_ACCOUNT_POD_QUOTA_EXCEEDED: "CLOUD_ACCOUNT_POD_QUOTA_EXCEEDED"

      Request exceeded MAX_PODS_PER_CLOUD_ACCOUNT quota - subscription/cloud account pod limit reached

    • ReadonlyCLOUD_ACCOUNT_SETUP_FAILURE: "CLOUD_ACCOUNT_SETUP_FAILURE"

      Cloud account setup encountered an error (e.g. pending email verification, blocked).

    • ReadonlyCLOUD_OPERATION_CANCELLED: "CLOUD_OPERATION_CANCELLED"

      The operation on the cloud provider was cancelled. Possibly due to a user action.

    • ReadonlyCLOUD_PROVIDER_DISK_SETUP_FAILURE: "CLOUD_PROVIDER_DISK_SETUP_FAILURE"

      Failed during instance bootstrap with error code Cannot convert NVMe-based dev id

    • ReadonlyCLOUD_PROVIDER_INSTANCE_NOT_LAUNCHED: "CLOUD_PROVIDER_INSTANCE_NOT_LAUNCHED"

      The cloud provider indicates that instance creation succeeded, yet the instance is never created. This can happen in certain edge cases such as quota exhaustion on GCP. We have an open bug here: https://partnerissuetracker.corp.google.com/issues/339061883

    • ReadonlyCLOUD_PROVIDER_LAUNCH_FAILURE: "CLOUD_PROVIDER_LAUNCH_FAILURE"

      Databricks may hit cloud provider failures when requesting instances to launch clusters. For example, AWS limits the number of running instances and EBS volumes. If you ask Databricks to launch a cluster that requires instances or EBS volumes that exceed your AWS limit, the cluster will fail with this status code. Parameters should include one of aws_api_error_code, aws_instance_state_reason, or aws_spot_request_status to indicate the AWS-provided reason why Databricks could not request the required instances for the cluster.
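
      The description above names three parameters that may carry the AWS-provided reason. A helper along these lines could pick whichever one is present. This is a hedged sketch: the three key names come from the description, but the shape of the parameters object (a flat string map) is an assumption, and `AWS_REASON_KEYS` and `awsLaunchFailureReason` are hypothetical names rather than SDK exports.

```typescript
// Parameter names taken from the CLOUD_PROVIDER_LAUNCH_FAILURE description.
const AWS_REASON_KEYS = [
  "aws_api_error_code",
  "aws_instance_state_reason",
  "aws_spot_request_status",
] as const;

// Assumption: termination parameters arrive as a flat map of string values.
function awsLaunchFailureReason(
  parameters: Record<string, string | undefined>,
): { key: string; value: string } | undefined {
  // Return the first listed parameter that is present and non-empty.
  for (const key of AWS_REASON_KEYS) {
    const value = parameters[key];
    if (value !== undefined && value !== "") {
      return { key, value };
    }
  }
  return undefined;
}
```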

    • ReadonlyCLOUD_PROVIDER_LAUNCH_FAILURE_DUE_TO_MISCONFIG: "CLOUD_PROVIDER_LAUNCH_FAILURE_DUE_TO_MISCONFIG"

      CLOUD_PROVIDER_LAUNCH_FAILURE (CPLF), but due to misconfiguration on the customer's side.

    • ReadonlyCLOUD_PROVIDER_RESOURCE_STOCKOUT: "CLOUD_PROVIDER_RESOURCE_STOCKOUT"

      The cloud provider is undergoing transient resource throttling. This is retryable.
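
      A few descriptions on this page state that a retry may help: this code is described as retryable, AZURE_RESOURCE_PROVIDER_THROTTLING says a retry might help, and AZURE_RESOURCE_MANAGER_THROTTLING suggests retrying after an hour. The sketch below encodes only those three hints. `RetryHint` and `retryHint` are hypothetical names, and treating every other code as "no hint" is an assumption of this example, not a statement about the service.

```typescript
const ONE_HOUR_MS = 60 * 60 * 1000;

interface RetryHint {
  retryable: boolean;
  // Minimum wait before retrying, when the description states one.
  minDelayMs?: number;
}

function retryHint(code: string): RetryHint {
  switch (code) {
    case "CLOUD_PROVIDER_RESOURCE_STOCKOUT":
    case "AZURE_RESOURCE_PROVIDER_THROTTLING":
      return { retryable: true };
    case "AZURE_RESOURCE_MANAGER_THROTTLING":
      // The request limit is applied per subscription every hour.
      return { retryable: true, minDelayMs: ONE_HOUR_MS };
    default:
      // Assumption: no retry guidance is encoded for other codes.
      return { retryable: false };
  }
}
```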

    • ReadonlyCLOUD_PROVIDER_RESOURCE_STOCKOUT_DUE_TO_MISCONFIG: "CLOUD_PROVIDER_RESOURCE_STOCKOUT_DUE_TO_MISCONFIG"

      The customer is repeatedly attempting to launch clusters with a configuration that the cloud service provider (CSP) is not able to provide.

    • ReadonlyCLOUD_PROVIDER_SHUTDOWN: "CLOUD_PROVIDER_SHUTDOWN"

      The instance that hosted the spark driver was terminated by the cloud provider. In AWS, for example, AWS may retire instances and directly shut them down. Parameters should include an aws_instance_state_reason field indicating the AWS-provided reason why the instance was terminated.

    • ReadonlyCLUSTER_OPERATION_THROTTLED: "CLUSTER_OPERATION_THROTTLED"

      Indicates that the cloud provider operations performed for the cluster were dropped on our end because of an influx of load in the cloud provider, in order to alleviate pressure within the DelegateRpcClient. Please see go/cmloadshedding for more.

    • ReadonlyCLUSTER_OPERATION_TIMEOUT: "CLUSTER_OPERATION_TIMEOUT"

      This error code indicates that a request missed its deadline. It can be used for either request timeouts or missed deadlines (i.e. a request was not completed because it was processed after its specified deadline).

    • ReadonlyCOMMUNICATION_LOST: "COMMUNICATION_LOST"

      Databricks may lose connection to services on the driver instance. This can happen when problems arise in cloud networking infrastructure or when the instance itself becomes unhealthy.

    • ReadonlyCONTAINER_LAUNCH_FAILURE: "CONTAINER_LAUNCH_FAILURE"

      Databricks encountered an unexpected error while launching containers on worker nodes for the cluster, terminating the cluster.

    • ReadonlyCONTROL_PLANE_CONNECTION_FAILURE: "CONTROL_PLANE_CONNECTION_FAILURE"
    • ReadonlyCONTROL_PLANE_CONNECTION_FAILURE_DUE_TO_MISCONFIG: "CONTROL_PLANE_CONNECTION_FAILURE_DUE_TO_MISCONFIG"
    • ReadonlyCONTROL_PLANE_REQUEST_FAILURE: "CONTROL_PLANE_REQUEST_FAILURE"

      Bootstrap timeout due to get runbook failure

    • ReadonlyCONTROL_PLANE_REQUEST_FAILURE_DUE_TO_MISCONFIG: "CONTROL_PLANE_REQUEST_FAILURE_DUE_TO_MISCONFIG"

      CONTROL_PLANE_REQUEST_FAILURE (CPRF), but due to misconfiguration on the customer's side.
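
      Several codes in the declaration come in pairs: a base code and a variant with the `_DUE_TO_MISCONFIG` suffix that attributes the same failure to customer-side misconfiguration. A helper like the one below could split a code into its base and that flag. It is a sketch based purely on the naming pattern visible on this page, and `splitMisconfig` is a hypothetical name, not an SDK function.

```typescript
const MISCONFIG_SUFFIX = "_DUE_TO_MISCONFIG";

// Splits a termination code into its base code and a flag that tells whether
// the code carried the _DUE_TO_MISCONFIG suffix.
function splitMisconfig(code: string): { base: string; dueToMisconfig: boolean } {
  const dueToMisconfig =
    code.length > MISCONFIG_SUFFIX.length &&
    code.slice(-MISCONFIG_SUFFIX.length) === MISCONFIG_SUFFIX;
  return {
    base: dueToMisconfig ? code.slice(0, -MISCONFIG_SUFFIX.length) : code,
    dueToMisconfig,
  };
}
```

      For example, `splitMisconfig("CONTROL_PLANE_REQUEST_FAILURE_DUE_TO_MISCONFIG")` yields the base code `CONTROL_PLANE_REQUEST_FAILURE` with `dueToMisconfig` set to `true`, which lets dashboards group both variants under one failure while still separating customer-side causes.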

    • ReadonlyDATA_ACCESS_CONFIG_CHANGED: "DATA_ACCESS_CONFIG_CHANGED"

      The data access config of the workspace has changed, and clusters using outdated config will be terminated.

    • ReadonlyDATABASE_CONNECTION_FAILURE: "DATABASE_CONNECTION_FAILURE"

      Cluster terminated due to database failure

    • ReadonlyDBFS_COMPONENT_UNHEALTHY: "DBFS_COMPONENT_UNHEALTHY"

      DBFS component unhealthy

    • ReadonlyDBR_IMAGE_RESOLUTION_FAILURE: "DBR_IMAGE_RESOLUTION_FAILURE"

      CMv2 could not resolve the DBR image for versionless workloads (REPL, GENERIC). This typically happens when no spark version is found from the channel mapping and the workload is versionless-enabled.

    • ReadonlyDISASTER_RECOVERY_REPLICATION: "DISASTER_RECOVERY_REPLICATION"

      The cluster was terminated when the primary workspace failed over to the secondary workspace. This is expected because there is no data plane in the secondary workspace.

    • ReadonlyDNS_RESOLUTION_ERROR: "DNS_RESOLUTION_ERROR"

      The cluster was terminated because the DNS resolution failed.

    • ReadonlyDOCKER_CONTAINER_CREATION_EXCEPTION: "DOCKER_CONTAINER_CREATION_EXCEPTION"

      Something went wrong during the creation of the docker container.

    • ReadonlyDOCKER_IMAGE_PULL_FAILURE: "DOCKER_IMAGE_PULL_FAILURE"

      Container setup failure due to docker image pulling failure

    • ReadonlyDOCKER_IMAGE_TOO_LARGE_FOR_INSTANCE_EXCEPTION: "DOCKER_IMAGE_TOO_LARGE_FOR_INSTANCE_EXCEPTION"

      Customer passed in a docker image that's too large for the instance.

    • ReadonlyDOCKER_INVALID_OS_EXCEPTION: "DOCKER_INVALID_OS_EXCEPTION"

      Docker container's OS was not valid.

    • ReadonlyDRIVER_EVICTION: "DRIVER_EVICTION"

      Driver pod evicted in Nephos

    • ReadonlyDRIVER_LAUNCH_TIMEOUT: "DRIVER_LAUNCH_TIMEOUT"

      Only relevant on k8s dataplanes (i.e. clusters launched with CMv2 - not CMv1). Original driver pod took too long to become ready and timed out.

    • ReadonlyDRIVER_NODE_UNREACHABLE: "DRIVER_NODE_UNREACHABLE"

      CMv2 unable to contact chauffeur or node-daemon on the driver node.

    • ReadonlyDRIVER_OUT_OF_DISK: "DRIVER_OUT_OF_DISK"

      Only relevant on k8s dataplanes (i.e. clusters launched with CMv2 - not CMv1).

      k8s evicted the driver pod due to disk pressure on the driver node. This is likely due to a customer job consuming too much disk and so this is classified as a customer issue.

    • ReadonlyDRIVER_OUT_OF_MEMORY: "DRIVER_OUT_OF_MEMORY"

      Only relevant on k8s dataplanes (i.e. clusters launched with CMv2 - not CMv1).

      k8s evicted the driver pod due to memory pressure on the driver node. A customer job consuming significant amounts of memory should not be able to trigger this as the driver container would OOM first (we set memory limits on our pods). Thus this termination reason will be considered a Databricks issue.
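
      The two eviction codes above are attributed differently: disk pressure is classified as a customer issue, memory pressure as a Databricks issue. A small lookup can make that distinction explicit when triaging. This is an illustrative sketch: `Attribution`, `driverEvictionAttribution`, and `attributionFor` are hypothetical names, and only the two codes whose attribution is stated above are included.

```typescript
type Attribution = "customer" | "databricks";

// Attribution as stated in the DRIVER_OUT_OF_DISK and DRIVER_OUT_OF_MEMORY
// descriptions; other codes are deliberately left out.
const driverEvictionAttribution: Record<string, Attribution> = {
  DRIVER_OUT_OF_DISK: "customer",
  DRIVER_OUT_OF_MEMORY: "databricks",
};

function attributionFor(code: string): Attribution | undefined {
  return Object.prototype.hasOwnProperty.call(driverEvictionAttribution, code)
    ? driverEvictionAttribution[code]
    : undefined;
}
```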

    • ReadonlyDRIVER_POD_CREATION_FAILURE: "DRIVER_POD_CREATION_FAILURE"

      Driver pod creation failure in Nephos

    • ReadonlyDRIVER_UNEXPECTED_FAILURE: "DRIVER_UNEXPECTED_FAILURE"

      Only relevant on k8s dataplanes (i.e. clusters launched with CMv2 - not CMv1). Unexpected failure during driver pod launch.

    • ReadonlyDRIVER_UNHEALTHY: "DRIVER_UNHEALTHY"

      Driver has been down or unresponsive for an extended period of time

    • ReadonlyDRIVER_UNREACHABLE: "DRIVER_UNREACHABLE"

      The cluster was terminated because no response from the chauffeur could be received. We name this "DRIVER_" instead of "CHAUFFEUR_" since chauffeur is non-external terminology

    • ReadonlyDRIVER_UNRESPONSIVE: "DRIVER_UNRESPONSIVE"

      Driver unresponsive

    • ReadonlyDYNAMIC_SPARK_CONF_SIZE_EXCEEDED: "DYNAMIC_SPARK_CONF_SIZE_EXCEEDED"

      The cluster was terminated because the size of the dynamic spark conf exceeded the limit.

    • ReadonlyEOS_SPARK_IMAGE: "EOS_SPARK_IMAGE"
    • ReadonlyEXECUTION_COMPONENT_UNHEALTHY: "EXECUTION_COMPONENT_UNHEALTHY"

      Execution component unhealthy

    • ReadonlyEXECUTOR_POD_UNSCHEDULED: "EXECUTOR_POD_UNSCHEDULED"

      Nephos: could not acquire executor pods from pod pool

    • ReadonlyGCP_API_RATE_QUOTA_EXCEEDED: "GCP_API_RATE_QUOTA_EXCEEDED"

      Rate quota exceeded for GCP API (e.g. Read requests per minute per region).

    • ReadonlyGCP_DENIED_BY_ORG_POLICY: "GCP_DENIED_BY_ORG_POLICY"

      Org policy is preventing a GCE API operation from being executed.

    • ReadonlyGCP_FORBIDDEN: "GCP_FORBIDDEN"

      Forbidden (403) returned by GCP API.

    • ReadonlyGCP_IAM_TIMEOUT: "GCP_IAM_TIMEOUT"

      GCP-specific IAM API timeout issues during the Workload Identity (Cluster Identity) binding process

    • ReadonlyGCP_INACCESSIBLE_KMS_KEY_FAILURE: "GCP_INACCESSIBLE_KMS_KEY_FAILURE"

      Failure due to disabled or inaccessible CMK.

    • ReadonlyGCP_INSUFFICIENT_CAPACITY: "GCP_INSUFFICIENT_CAPACITY"

      Insufficient capacity failure from GCE API.

    • ReadonlyGCP_IP_SPACE_EXHAUSTED: "GCP_IP_SPACE_EXHAUSTED"

      Subnet IP space exhausted.

    • ReadonlyGCP_KMS_KEY_PERMISSION_DENIED: "GCP_KMS_KEY_PERMISSION_DENIED"

      Failure due to missing/incorrect permission setup on CMK.

    • ReadonlyGCP_NOT_FOUND: "GCP_NOT_FOUND"

      Not found (404) returned by GCP API.

    • ReadonlyGCP_QUOTA_EXCEEDED: "GCP_QUOTA_EXCEEDED"

      Could not find enough GCP resources to fulfill the request. TODO: It's very unfortunate that we have per-cloud termination reasons while we should have cloud-agnostic termination reasons. For example, we should consolidate {AZURE_QUOTA_EXCEEDED_EXCEPTION, AWS_REQUEST_LIMIT_EXCEEDED and GCP_QUOTA_EXCEEDED}, {AWS_INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET_FAILURE, IP_EXHAUSTION_FAILURE}, etc.
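
      Until such consolidation exists, callers can group the per-cloud codes themselves. The sketch below encodes only the two groups named in the note above. The group labels (`QUOTA_EXCEEDED`, `IP_EXHAUSTION`) and the names `cloudAgnosticGroups` and `cloudAgnosticGroup` are hypothetical, chosen for illustration, and are not part of the SDK.

```typescript
// The two example groups named in the GCP_QUOTA_EXCEEDED description.
// Group labels are illustrative assumptions.
const cloudAgnosticGroups: { group: string; codes: string[] }[] = [
  {
    group: "QUOTA_EXCEEDED",
    codes: [
      "AZURE_QUOTA_EXCEEDED_EXCEPTION",
      "AWS_REQUEST_LIMIT_EXCEEDED",
      "GCP_QUOTA_EXCEEDED",
    ],
  },
  {
    group: "IP_EXHAUSTION",
    codes: [
      "AWS_INSUFFICIENT_FREE_ADDRESSES_IN_SUBNET_FAILURE",
      "IP_EXHAUSTION_FAILURE",
    ],
  },
];

// Returns the cloud-agnostic group for a code, or undefined if it is ungrouped.
function cloudAgnosticGroup(code: string): string | undefined {
  for (const entry of cloudAgnosticGroups) {
    if (entry.codes.indexOf(code) !== -1) {
      return entry.group;
    }
  }
  return undefined;
}
```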

    • ReadonlyGCP_RESOURCE_QUOTA_EXCEEDED: "GCP_RESOURCE_QUOTA_EXCEEDED"

      Resource quota exceeded (e.g. # of n1 vCPUs in a region).

    • ReadonlyGCP_SERVICE_ACCOUNT_ACCESS_DENIED: "GCP_SERVICE_ACCOUNT_ACCESS_DENIED"

      Missing permissions to launch VM with service account.

    • ReadonlyGCP_SERVICE_ACCOUNT_DELETED: "GCP_SERVICE_ACCOUNT_DELETED"

      The GCP service account associated with the DBR cluster is deleted.

    • ReadonlyGCP_SERVICE_ACCOUNT_NOT_FOUND: "GCP_SERVICE_ACCOUNT_NOT_FOUND"

      VM attempting to launch with non-existent service account.

    • ReadonlyGCP_SUBNET_NOT_READY: "GCP_SUBNET_NOT_READY"

      GCP subnet is in transient "resourceNotReady" state.

    • ReadonlyGCP_TRUSTED_IMAGE_PROJECTS_VIOLATED: "GCP_TRUSTED_IMAGE_PROJECTS_VIOLATED"

      GCP Databricks VM Machine Image is blocked by customer organization policy.

    • ReadonlyGKE_BASED_CLUSTER_TERMINATION: "GKE_BASED_CLUSTER_TERMINATION"

      For the GCP CMv1 Migration, we will terminate all CMv2 based clusters with this failure.

    • ReadonlyGLOBAL_INIT_SCRIPT_FAILURE: "GLOBAL_INIT_SCRIPT_FAILURE"

      Databricks cannot load and execute a global init script on one of the cluster's nodes, or the init script terminates with a non-zero exit code.

    • ReadonlyHIVE_METASTORE_PROVISIONING_FAILURE: "HIVE_METASTORE_PROVISIONING_FAILURE"

      Hive Metastore provisioning failure in the launch container step.

    • ReadonlyHIVEMETASTORE_CONNECTIVITY_FAILURE: "HIVEMETASTORE_CONNECTIVITY_FAILURE"

      The cluster was terminated because hivemetastore connectivity check failed.

    • ReadonlyIMAGE_PULL_PERMISSION_DENIED: "IMAGE_PULL_PERMISSION_DENIED"

      Failed to pull DBR images due to permission error.

    • ReadonlyIN_PENALTY_BOX: "IN_PENALTY_BOX"

      This customer/error combination is a known issue and is intentionally excluded from termination metrics

    • ReadonlyINACTIVITY: "INACTIVITY"

      This cluster was terminated since it was idle.

    • ReadonlyINIT_CONTAINER_NOT_FINISHED: "INIT_CONTAINER_NOT_FINISHED"

      The bootstrapping init-containers in Spark failed or timed out, blocking the Spark container from bootstrapping. This is a refinement of SPARK_STARTUP_FAILURE. (init-containers are a bootstrapping step owned by Databricks)

    • ReadonlyINIT_SCRIPT_FAILURE: "INIT_SCRIPT_FAILURE"

      Databricks cannot load and execute a cluster-scoped init script on one of the cluster's nodes, the init script terminates with a non-zero exit code, or there was a general failure while loading or executing init scripts that does not pertain to any specific script.

    • ReadonlyINSTANCE_POOL_CLUSTER_FAILURE: "INSTANCE_POOL_CLUSTER_FAILURE"

      Instance pool backed cluster specific failure

    • ReadonlyINSTANCE_POOL_MAX_CAPACITY_REACHED: "INSTANCE_POOL_MAX_CAPACITY_REACHED"

      Attempting to launch more instances was rejected as it would exceed the pool's max capacity.

    • ReadonlyINSTANCE_POOL_NOT_FOUND: "INSTANCE_POOL_NOT_FOUND"

      The instance pool did not exist when the cluster was launched.

    • ReadonlyINSTANCE_UNREACHABLE: "INSTANCE_UNREACHABLE"

      Databricks was not able to access instances in order to start the cluster. This can be a transient networking issue. If the problem persists, this usually indicates a networking environment misconfiguration.

    • ReadonlyINSTANCE_UNREACHABLE_DUE_TO_MISCONFIG: "INSTANCE_UNREACHABLE_DUE_TO_MISCONFIG"

      Instance unreachable, but due to misconfiguration on the customer's side

    • ReadonlyINTERNAL_CAPACITY_FAILURE: "INTERNAL_CAPACITY_FAILURE"

      Nephos internal error due to insufficient provisioned k8s capacity or insufficient cloud quota

    • ReadonlyINTERNAL_ERROR: "INTERNAL_ERROR"

      Databricks encountered an unexpected error which forced the running cluster to be terminated. Please contact Databricks support for additional details.

    • ReadonlyINVALID_ARGUMENT: "INVALID_ARGUMENT"

      Cannot launch the cluster because the user specified an invalid argument. For example, the user might specify an invalid spark version for the cluster.

    • ReadonlyINVALID_AWS_PARAMETER: "INVALID_AWS_PARAMETER"

      A parameter the user specified, or the user account used to create the cluster, is invalid according to AWS.

    • ReadonlyINVALID_INSTANCE_PLACEMENT_PROTOCOL: "INVALID_INSTANCE_PLACEMENT_PROTOCOL"

      It indicates there is a placement v2 protocol rollout/rollback event for the corresponding workspace when processing the placement session on the instance-manager side. A retry will fix the issue by switching back to the correct placement protocol.

    • ReadonlyINVALID_SPARK_IMAGE: "INVALID_SPARK_IMAGE"

      Container setup failed due to an invalid Spark image.

    • ReadonlyINVALID_WORKER_IMAGE_FAILURE: "INVALID_WORKER_IMAGE_FAILURE"

      The instances acquired from a pool in IMv2 do not have a valid worker image to be used in the cluster launch. This usually occurs after AMI/VHD upgrades, worker branch updates, etc.

    • ReadonlyIP_EXHAUSTION_FAILURE: "IP_EXHAUSTION_FAILURE"

      Cluster failure due to IP space exhaustion. For example on CMv2, Kubernetes will fail to scale up new nodes if the pod IP CIDR block is exhausted.

    • ReadonlyJOB_FINISHED: "JOB_FINISHED"

      This cluster was launched by a Job, and terminated when the Job completed.

    • ReadonlyK8S_ACTIVE_POD_QUOTA_EXCEEDED: "K8S_ACTIVE_POD_QUOTA_EXCEEDED"

      Request exceeded MAX_ACTIVE_DBR_PODS_PER_K8S_CLUSTER quota - too many active pods on the K8s cluster

    • ReadonlyK8S_AUTOSCALING_FAILURE: "K8S_AUTOSCALING_FAILURE"

      K8S failed to upscale to acquire new nodes

    • ReadonlyK8S_DBR_CLUSTER_LAUNCH_TIMEOUT: "K8S_DBR_CLUSTER_LAUNCH_TIMEOUT"

      DBR Cluster launched on K8s (i.e. CMv2) has failed to start up in time

    • ReadonlyLAZY_ALLOCATION_TIMEOUT: "LAZY_ALLOCATION_TIMEOUT"

      Lazy allocation timeout. Timeout before any internal DBR clusters were allocated.

    • ReadonlyMAINTENANCE_MODE: "MAINTENANCE_MODE"

      Cluster terminated manually by on-call due to emergency maintenance

    • ReadonlyMETASTORE_COMPONENT_UNHEALTHY: "METASTORE_COMPONENT_UNHEALTHY"

      Metastore component unhealthy

    • ReadonlyMTLS_PORT_CONNECTIVITY_FAILURE: "MTLS_PORT_CONNECTIVITY_FAILURE"

      The cluster was terminated because mutual TLS port 8443 check failed.

    • ReadonlyNEPHOS_RESOURCE_MANAGEMENT: "NEPHOS_RESOURCE_MANAGEMENT"

      Request comes from Nephos resource pool auto management.

    • ReadonlyNETVISOR_SETUP_TIMEOUT: "NETVISOR_SETUP_TIMEOUT"

      Nephos timed out while blocking on the netvisor setup-ready signal. This error code only applies to clusters with the attribute should_block_for_network_readiness: true

    • ReadonlyNETWORK_CHECK_CONTROL_PLANE_FAILURE: "NETWORK_CHECK_CONTROL_PLANE_FAILURE"
    • ReadonlyNETWORK_CHECK_CONTROL_PLANE_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_CONTROL_PLANE_FAILURE_DUE_TO_MISCONFIG"
    • ReadonlyNETWORK_CHECK_DNS_SERVER_FAILURE: "NETWORK_CHECK_DNS_SERVER_FAILURE"
    • ReadonlyNETWORK_CHECK_DNS_SERVER_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_DNS_SERVER_FAILURE_DUE_TO_MISCONFIG"
    • ReadonlyNETWORK_CHECK_METADATA_ENDPOINT_FAILURE: "NETWORK_CHECK_METADATA_ENDPOINT_FAILURE"
    • ReadonlyNETWORK_CHECK_METADATA_ENDPOINT_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_METADATA_ENDPOINT_FAILURE_DUE_TO_MISCONFIG"
    • ReadonlyNETWORK_CHECK_MULTIPLE_COMPONENTS_FAILURE: "NETWORK_CHECK_MULTIPLE_COMPONENTS_FAILURE"
    • ReadonlyNETWORK_CHECK_MULTIPLE_COMPONENTS_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_MULTIPLE_COMPONENTS_FAILURE_DUE_TO_MISCONFIG"
    • ReadonlyNETWORK_CHECK_NIC_FAILURE: "NETWORK_CHECK_NIC_FAILURE"

      Start of network health check generated failures

    • ReadonlyNETWORK_CHECK_NIC_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_NIC_FAILURE_DUE_TO_MISCONFIG"

      Start of network health check generated failures due to misconfiguration

    • ReadonlyNETWORK_CHECK_STORAGE_FAILURE: "NETWORK_CHECK_STORAGE_FAILURE"
    • ReadonlyNETWORK_CHECK_STORAGE_FAILURE_DUE_TO_MISCONFIG: "NETWORK_CHECK_STORAGE_FAILURE_DUE_TO_MISCONFIG"
    • ReadonlyNETWORK_CONFIGURATION_FAILURE: "NETWORK_CONFIGURATION_FAILURE"

      The cluster was terminated due to an error in the network configuration.

    • ReadonlyNFS_MOUNT_FAILURE: "NFS_MOUNT_FAILURE"

      Failure when mounting remote NFS to container

    • ReadonlyNO_MATCHED_K8S: "NO_MATCHED_K8S"

      Serverless only. There are no eligible K8s for the cluster.

    • ReadonlyNO_MATCHED_K8S_TESTING_TAG: "NO_MATCHED_K8S_TESTING_TAG"

      Serverless only. The preselected K8s for the cluster is not eligible.

    • ReadonlyNPIP_TUNNEL_SETUP_FAILURE: "NPIP_TUNNEL_SETUP_FAILURE"

      Bootstrap failure due to Ngrok tunnel setup timeout or failure. For example, if the worker node is unable to reach the Ngrok tunnel domain.

    • ReadonlyNPIP_TUNNEL_TOKEN_FAILURE: "NPIP_TUNNEL_TOKEN_FAILURE"

      The ngrok tunnel token provisioning failed for any reason, for example hitting the max capacity of allowed ngrok tokens. (ES-32083)

    • ReadonlyPOD_ASSIGNMENT_FAILURE: "POD_ASSIGNMENT_FAILURE"

      Driver or executor pod failed to finish assigning.

    • ReadonlyPOD_SCHEDULING_FAILURE: "POD_SCHEDULING_FAILURE"

      Driver or executor pod failed to be scheduled.

    • ReadonlyRATE_LIMITED: "RATE_LIMITED"
    • ReadonlyREQUEST_REJECTED: "REQUEST_REJECTED"
    • ReadonlyREQUEST_THROTTLED: "REQUEST_THROTTLED"

      Databricks cannot handle the request at this moment. Please try again later and contact Databricks if the problem persists.

    • ReadonlyRESOURCE_USAGE_BLOCKED: "RESOURCE_USAGE_BLOCKED"

      Gatekeeper indicated the cluster should be shut down.

    • ReadonlySECRET_CREATION_FAILURE: "SECRET_CREATION_FAILURE"

      Dynamic secret generation failed.

    • ReadonlySECRET_PERMISSION_DENIED: "SECRET_PERMISSION_DENIED"

      Customer passed in a secret that they do not have permissions to resolve.

    • ReadonlySECRET_RESOLUTION_ERROR: "SECRET_RESOLUTION_ERROR"

      Catch-all error for all secret resolution issues in cluster launch. This should be alerted on, and is considered a server error. This can be split out into other cases if there are client errors. For example, INVALID_ARGUMENT is used for secrets that don't exist and for permission issues.

    • ReadonlySECURITY_DAEMON_REGISTRATION_EXCEPTION: "SECURITY_DAEMON_REGISTRATION_EXCEPTION"

      Container setup failed during registration to security daemon due to an unspecified error.

    • ReadonlySELF_BOOTSTRAP_FAILURE: "SELF_BOOTSTRAP_FAILURE"

      SelfBootstrap failure. Either self-bootstrap fast fail or node daemon ping timeout

    • ReadonlySERVERLESS_LONG_RUNNING_TERMINATED: "SERVERLESS_LONG_RUNNING_TERMINATED"

      This error code is used to terminate long-running Generic compute jobs in Serverless Environment as part of the NephosLongRunning watcher running in Cluster Monitor Service.

    • ReadonlySKIPPED_SLOW_NODES: "SKIPPED_SLOW_NODES"

      Cluster start successfully completed but skipped some instances which were slow to launch

    • ReadonlySLOW_IMAGE_DOWNLOAD: "SLOW_IMAGE_DOWNLOAD"

      Container launch timed out downloading the spark image. This can happen if the customer has byo-vpc/vnet and the download of large files is being throttled.

    • ReadonlySPARK_ERROR: "SPARK_ERROR"

      Spark error on startup

    • ReadonlySPARK_IMAGE_DOWNLOAD_FAILURE: "SPARK_IMAGE_DOWNLOAD_FAILURE"

      Container launch failed while downloading the spark image. Catch-all for anything that goes wrong while downloading and extracting the spark tarball.

    • ReadonlySPARK_IMAGE_DOWNLOAD_THROTTLED: "SPARK_IMAGE_DOWNLOAD_THROTTLED"

      Container launch failed due to storage servers throttling our download of spark images. Can happen due to transient spikes of downloads overloading storage servers or gradual increase in usage. In the latter case we need to increase the number of storage servers in the region to help spread load.

    • ReadonlySPARK_IMAGE_NOT_FOUND: "SPARK_IMAGE_NOT_FOUND"

      The spark image specified for the cluster was not found when attempting to download. Usually due to the customer specifying a bad custom image.

    • ReadonlySPARK_STARTUP_FAILURE: "SPARK_STARTUP_FAILURE"

      The Spark driver failed to start. Possible reasons may include incompatible libraries and initialization scripts that corrupted the Spark container.

    • ReadonlySPOT_INSTANCE_TERMINATION: "SPOT_INSTANCE_TERMINATION"

      The cluster was terminated because a spot instance was terminated by the cloud provider.

    • ReadonlySSH_BOOTSTRAP_FAILURE: "SSH_BOOTSTRAP_FAILURE"

      Exception when setting up instances using ssh bootstrap

    • ReadonlySTORAGE_DOWNLOAD_FAILURE: "STORAGE_DOWNLOAD_FAILURE"

      Bootstrap timeout due to script download failure

    • ReadonlySTORAGE_DOWNLOAD_FAILURE_DUE_TO_MISCONFIG: "STORAGE_DOWNLOAD_FAILURE_DUE_TO_MISCONFIG"

      Bootstrap timeout due to script download failure, but due to misconfiguration on the customer's side

    • ReadonlySTORAGE_DOWNLOAD_FAILURE_SLOW: "STORAGE_DOWNLOAD_FAILURE_SLOW"

      Artifact download failed because it was too slow

    • ReadonlySTORAGE_DOWNLOAD_FAILURE_THROTTLED: "STORAGE_DOWNLOAD_FAILURE_THROTTLED"

      Artifact download failed because it was throttled by the download server

    • ReadonlySTS_CLIENT_SETUP_FAILURE: "STS_CLIENT_SETUP_FAILURE"

      Container setup failed during container registration to security daemon due to STS endpoint connection error.

    • ReadonlySUBNET_EXHAUSTED_FAILURE: "SUBNET_EXHAUSTED_FAILURE"

      Subnet (typically Azure vnet injected) has run out of ip addresses

    • ReadonlyTEMPORARILY_UNAVAILABLE: "TEMPORARILY_UNAVAILABLE"

      Cluster is terminated because the services are temporarily unavailable. This normally happens when CM is restarting and draining execution contexts, or IM/Delegate is overloaded, so that it will not be able to retry the instance launch request.

    • ReadonlyTRIAL_EXPIRED: "TRIAL_EXPIRED"

      The cluster was terminated because it was running in a trial workspace that expired.

    • ReadonlyUNEXPECTED_LAUNCH_FAILURE: "UNEXPECTED_LAUNCH_FAILURE"

      While launching this cluster, Databricks failed to complete critical setup steps, terminating the cluster.

    • ReadonlyUNEXPECTED_POD_RECREATION: "UNEXPECTED_POD_RECREATION"

      Only relevant on k8s dataplanes (i.e. clusters launched with CMv2, not CMv1). Unexpected new driver pod created.

    • ReadonlyUNKNOWN: "UNKNOWN"

      Default when there is no termination code.

    • ReadonlyUNSUPPORTED_INSTANCE_TYPE: "UNSUPPORTED_INSTANCE_TYPE"

      Failure due to an instance being of an unsupported type. This is used when an instance in an EC2 fleet is of an unrecognized type, or an invalid type (i.e. graviton when we don't want graviton instances). This should be alerted on.

    • ReadonlyUPDATE_INSTANCE_PROFILE_FAILURE: "UPDATE_INSTANCE_PROFILE_FAILURE"

      Attach projects failure

    • ReadonlyUSAGE_POLICY_ENTITLEMENT_DENIED: "USAGE_POLICY_ENTITLEMENT_DENIED"

      Cluster request is denied due to a disallowed usage policy entitlement.

    • ReadonlyUSER_INITIATED_VM_TERMINATION: "USER_INITIATED_VM_TERMINATION"

      User request for termination directly to cloud

    • ReadonlyUSER_REQUEST: "USER_REQUEST"

      A user terminated the cluster directly. Parameters should include a username field that indicates the specific user who terminated the cluster.

    • ReadonlyWORKER_SETUP_FAILURE: "WORKER_SETUP_FAILURE"

      Bootstrap failure due to error during worker setup, usually due to an issue with disk or gpu setup. See SetupCommandBuilder for other possible causes

    • ReadonlyWORKSPACE_CANCELLED_ERROR: "WORKSPACE_CANCELLED_ERROR"

      The workspace was cancelled, so the cluster request is denied or the cluster is terminated.

    • ReadonlyWORKSPACE_CONFIGURATION_ERROR: "WORKSPACE_CONFIGURATION_ERROR"

      Workspace configuration is in an error state due to a configuration issue or an ACL modification on the customer side.

    • ReadonlyWORKSPACE_UPDATE: "WORKSPACE_UPDATE"

      Worker environment version was changed due to workspace network or CMK update.
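
The descriptions above suggest a rough triage of termination codes: some are expected shutdowns, some carry a `_DUE_TO_MISCONFIG` suffix that points at customer-side configuration, and a few explicitly say a retry should help. The following is a minimal sketch of that idea, not part of the SDK. `SampleTerminationCode`, `Triage`, and `triage` are hypothetical names introduced for illustration, the constant is a hand-copied subset of the values listed above, and the grouping into buckets is an assumption based only on the descriptions.

```typescript
// Hypothetical local subset of the codes documented above. In real code the
// values would come from the SDK's TerminationCode constant instead.
const SampleTerminationCode = {
  INACTIVITY: "INACTIVITY",
  JOB_FINISHED: "JOB_FINISHED",
  USER_REQUEST: "USER_REQUEST",
  REQUEST_THROTTLED: "REQUEST_THROTTLED",
  INVALID_INSTANCE_PLACEMENT_PROTOCOL: "INVALID_INSTANCE_PLACEMENT_PROTOCOL",
  INSTANCE_UNREACHABLE: "INSTANCE_UNREACHABLE",
  INSTANCE_UNREACHABLE_DUE_TO_MISCONFIG: "INSTANCE_UNREACHABLE_DUE_TO_MISCONFIG",
  INTERNAL_ERROR: "INTERNAL_ERROR",
} as const;

// Hypothetical triage buckets; the names are illustrative only.
type Triage = "expected" | "retry" | "fix-config" | "investigate";

// Codes whose descriptions describe a normal, intended shutdown.
const EXPECTED = new Set<string>([
  SampleTerminationCode.INACTIVITY,
  SampleTerminationCode.JOB_FINISHED,
  SampleTerminationCode.USER_REQUEST,
]);

// Codes whose descriptions explicitly mention trying again.
const RETRY = new Set<string>([
  SampleTerminationCode.REQUEST_THROTTLED,
  SampleTerminationCode.INVALID_INSTANCE_PLACEMENT_PROTOCOL,
]);

// Classify a termination code string into one of the buckets above.
function triage(code: string): Triage {
  if (EXPECTED.has(code)) return "expected";
  // Several codes in the list use this suffix for customer-side misconfiguration.
  if (code.endsWith("_DUE_TO_MISCONFIG")) return "fix-config";
  if (RETRY.has(code)) return "retry";
  // Everything else (including codes not in the subset) needs a closer look.
  return "investigate";
}

console.log(triage(SampleTerminationCode.REQUEST_THROTTLED));
console.log(triage(SampleTerminationCode.INSTANCE_UNREACHABLE_DUE_TO_MISCONFIG));
```

Taking a plain `string` rather than a narrow union keeps the function usable with codes that are not in the sample subset; unknown values fall through to `"investigate"`.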