Databricks SDK for JavaScript

    Configuration for the managed ingestion pipeline. Groups the ingestion destination (required) and an optional backfill source.

    interface IngestionConfig {
        backfillJobId?: bigint;
        backfillSource?: BackfillSource;
        deduplicationColumns?: string[];
        ingestionDestination?: IngestionDestination;
        ingestionJobId?: bigint;
        ingestionPipelineId?: string;
    }
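Because every property is declared optional, code that reads an IngestionConfig may want to check which fields are actually populated before using them. The following is a minimal sketch under stated assumptions: the interface is mirrored locally rather than imported (the SDK import path is not shown here), BackfillSource and IngestionDestination are opaque stand-ins, and unsetFields and the pipeline ID are hypothetical names used only for illustration.

```typescript
// Local stand-ins: the real BackfillSource and IngestionDestination shapes
// are not shown in this section, so they are treated as opaque records.
type BackfillSource = Record<string, unknown>;
type IngestionDestination = Record<string, unknown>;

// Local mirror of the interface documented above.
interface IngestionConfig {
  backfillJobId?: bigint;
  backfillSource?: BackfillSource;
  deduplicationColumns?: string[];
  ingestionDestination?: IngestionDestination;
  ingestionJobId?: bigint;
  ingestionPipelineId?: string;
}

// Hypothetical helper: returns the names of properties that are undefined.
function unsetFields(config: IngestionConfig): string[] {
  const names: (keyof IngestionConfig)[] = [
    "backfillJobId",
    "backfillSource",
    "deduplicationColumns",
    "ingestionDestination",
    "ingestionJobId",
    "ingestionPipelineId",
  ];
  return names.filter((name) => config[name] === undefined);
}

// Hypothetical values for illustration only.
const exampleConfig: IngestionConfig = {
  deduplicationColumns: ["value.user_id"],
  ingestionPipelineId: "hypothetical-pipeline-id",
};

console.log(unsetFields(exampleConfig));
```

For the example object, the helper reports the four properties that were left out: backfillJobId, backfillSource, ingestionDestination, and ingestionJobId.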
    Properties

    backfillJobId?: bigint

    The ID of the Databricks Job that performs the historical backfill of the ingestion Delta table.

    backfillSource?: BackfillSource

    A user-provided source for backfilling data. Historical data is used when creating a training set from streaming features linked to this Stream. The backfill data stored in this location is copied into the ingestion table for offline querying and training. The schema of this source must exactly match the key and payload schemas specified for this Stream.

    deduplicationColumns?: string[]

    Column paths used to identify duplicate rows during ingestion; only one row per distinct combination of these values is kept. Use dot notation for nested fields (e.g. value.user_id). An empty list means every column is compared.
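To make the deduplication semantics concrete, here is a client-side sketch of how dot-notation paths could select the comparison key. It is an illustration only, not the service's implementation: the helper names (canonical, getPath, dedupeRows) are hypothetical, and keeping the first row seen is an assumption, since the description above does not say which duplicate survives.

```typescript
// Order-independent string form of a value, used as a comparison key.
function canonical(value: unknown): string {
  if (value === undefined) return "undefined";
  if (typeof value === "bigint") return `${value.toString()}n`;
  if (Array.isArray(value)) return `[${value.map(canonical).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonical(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Resolves a dot-notation path such as "value.user_id" against a row.
function getPath(row: unknown, path: string): unknown {
  let current: unknown = row;
  for (const part of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[part];
  }
  return current;
}

// Keeps one row per distinct combination of the given column values.
// Assumption: the first row seen wins. An empty list compares whole rows.
function dedupeRows<T>(rows: T[], columns: string[]): T[] {
  const seen = new Set<string>();
  const kept: T[] = [];
  for (const row of rows) {
    const key =
      columns.length === 0
        ? canonical(row)
        : canonical(columns.map((column) => getPath(row, column)));
    if (!seen.has(key)) {
      seen.add(key);
      kept.push(row);
    }
  }
  return kept;
}

// Hypothetical rows: the first two share value.user_id.
const sampleRows = [
  { key: "a", value: { user_id: 1, clicks: 3 } },
  { key: "b", value: { user_id: 1, clicks: 5 } },
  { key: "c", value: { user_id: 2, clicks: 1 } },
];

console.log(dedupeRows(sampleRows, ["value.user_id"]).map((r) => r.key));
```

With ["value.user_id"] as the deduplication columns, the sketch keeps rows "a" and "c"; with an empty list, all three rows survive because no two are identical in every column.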

    ingestionDestination?: IngestionDestination

    Destination for the managed Delta table that holds an offline copy of the streaming data for querying and training. This table contains both (1) forward-filled data from the Stream and (2) backfilled data from the BackfillSource (if provided). The table is created and managed automatically and is deleted when the Stream is deleted.

    ingestionJobId?: bigint

    The ID of the Databricks Job that performs the forward-fill ingestion.
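Both job ID properties are typed as bigint, and JSON.stringify throws a TypeError when it encounters a bigint value. A minimal sketch of one way to log or serialize these IDs is to pass a replacer that converts bigint values to decimal strings. The replacer name and the ID values below are hypothetical.

```typescript
// Replacer for JSON.stringify: bigint values are emitted as decimal strings
// so that large job IDs keep full precision.
function bigintToString(_key: string, value: unknown): unknown {
  return typeof value === "bigint" ? value.toString() : value;
}

// Hypothetical IDs for illustration only.
const jobIds = {
  backfillJobId: BigInt("9007199254740993"), // above Number.MAX_SAFE_INTEGER
  ingestionJobId: BigInt("42"),
};

const serialized = JSON.stringify(jobIds, bigintToString);
console.log(serialized);
// prints {"backfillJobId":"9007199254740993","ingestionJobId":"42"}
```

Emitting strings rather than numbers avoids silent precision loss for IDs above Number.MAX_SAFE_INTEGER; a consumer can restore the original value with BigInt(text).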

    ingestionPipelineId?: string

    The ID of the SDP pipeline that continuously copies new events from the streaming source into the ingestion Delta table.