Lakeflow Declarative Pipelines event log schema

The Lakeflow Declarative Pipelines event log contains all information related to a pipeline, including audit logs, data quality checks, pipeline progress, and data lineage.

The following tables describe the event log schema. Some of these fields, such as the details field, contain JSON data that requires parsing before some queries can be performed. Azure Databricks supports the : operator to parse JSON fields. See : (colon sign) operator.
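
For example, a query along the following lines uses the : operator to read fields out of the details column. This is a minimal sketch: the table name my_catalog.my_schema.pipeline_event_log is a placeholder for wherever your pipeline's event log is exposed as a table, not a name from this article.

```sql
-- Minimal sketch: list recorded user actions.
-- my_catalog.my_schema.pipeline_event_log is a placeholder table name.
SELECT
  timestamp,
  details:user_action.action    AS action,     -- parsed from the details JSON string
  details:user_action.user_name AS user_name
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'user_action'
ORDER BY timestamp DESC;
```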

Note

Some fields in the event log are for internal use by Azure Databricks. The following documentation describes the fields that are intended for customer consumption.

For details about using the Lakeflow Declarative Pipelines event log, see Lakeflow Declarative Pipelines event log.

PipelineEvent object

Represents a single pipeline event in the event log.

Field Description
id A unique identifier for the event log record.
sequence A JSON string containing metadata to identify and order events.
origin A JSON string containing metadata for the origin of the event, for example, the cloud provider, the cloud provider region, user, and pipeline information. See Origin object.
timestamp The time the event was recorded, in UTC.
message A human-readable message describing the event.
level The severity level of the event. The possible values are:
  • INFO: Informational events
  • WARN: Unexpected, but non-critical issues
  • ERROR: Event failure that might need user attention
  • METRICS: Used for high-volume events stored only in the Delta table, and not shown in the pipelines UI.
maturity_level The stability of the event schema. The possible values are:
  • STABLE: The schema is stable and will not change.
  • NULL: The schema is stable and will not change. The value might be NULL if the record was created before the maturity_level field was added (release 2022.37).
  • EVOLVING: The schema is not stable and might change.
  • DEPRECATED: The schema is deprecated and the Lakeflow Declarative Pipelines runtime might stop producing this event at any time.

Building monitoring or alerts based on EVOLVING or DEPRECATED events is not recommended.
error If an error occurred, details describing the error.
details A JSON string containing structured details of the event. This is the primary field used for analyzing events. The JSON string format depends on the event_type. See The details object for more information.
event_type The event type. For a list of event types, and what details object type they create, see The details object.
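
For example, the following sketch (using the same placeholder event log table as the earlier example) surfaces recent warnings and errors while skipping events whose schema is EVOLVING or DEPRECATED:

```sql
-- Minimal sketch: recent WARN/ERROR events with a stable schema.
SELECT timestamp, level, event_type, message, error
FROM my_catalog.my_schema.pipeline_event_log
WHERE level IN ('WARN', 'ERROR')
  AND (maturity_level = 'STABLE' OR maturity_level IS NULL)  -- NULL can appear on records older than release 2022.37
ORDER BY timestamp DESC
LIMIT 100;
```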

The details object

Each event has different details properties in the JSON object, based on the event_type of the event. This table lists each event_type and its associated details type. The details properties are described in the Details types section.

Details type by event_type Description
create_update Captures the complete configuration that is used to start a pipeline update. Includes any configuration set by Databricks. For details, see Details for create_update.
user_action Provides details on any user action on the pipeline (including creating a pipeline, as well as starting or canceling an update). For details, see Details for user_action event.
flow_progress Describes the lifecycle of a flow, from starting, through running, to completed or failed. For details, see Details for flow_progress event.
update_progress Describes the lifecycle of a pipeline update, from starting, through running, to completed or failed. For details, see Details for update_progress event.
flow_definition Defines the schema and query plan for any transformations occurring in a given flow. Can be thought of as the edges of the Dataflow DAG. It can be used to calculate the lineage for each flow as well as to see the explained query plan. For details, see Details for flow_definition event.
dataset_definition Defines a dataset, which is either the source or the destination for a given flow. For details, see Details for dataset_definition event.
sink_definition Defines a given sink. For details, see Details for sink_definition event.
deprecation Lists features that are soon to be or currently deprecated that this pipeline uses. For examples of the values, see Details enum for deprecation event.
cluster_resources Includes information about cluster resources for pipelines that are running on classic compute. These metrics are only populated for classic compute pipelines. For details, see Details for cluster_resources event.
autoscale Includes information about autoscaling for pipelines that are running on classic compute. These metrics are only populated for classic compute pipelines. For details, see Details for autoscale event.
planning_information Represents planning information related to materialized view incremental vs. full refresh. Can be used to get more details on why a materialized view is fully recomputed. For details, see Details for planning_information event.
hook_progress An event to indicate the current status of a user hook during the pipeline run. Used for monitoring the status of event hooks, for example, to send to external observability products. For details, see Details for hook_progress event.
operation_progress Includes information about the progress of an operation. For details, see Details for operation_progress event.

Details types

The following objects describe the details field for each event type in the PipelineEvent object.

Details for create_update

The details for the create_update event.

Field Description
dbr_version The version of the Databricks Runtime.
run_as The user ID that the update will run on behalf of. Typically this is either the owner of the pipeline or a service principal.
cause The reason for the update. Typically either JOB_TASK if run from a job, or USER_ACTION when run interactively by a user.

Details for user_action event

The details for the user_action event. Includes the following fields:

Field Description
user_name The name of the user that triggered a pipeline update.
user_id The ID of the user that triggered a pipeline update. This is not always the same as the run_as user, which could be a service principal or other user.
action The action the user took, including START and CREATE.

Details for flow_progress event

The details for a flow_progress event.

Field Description
status The new status of the flow. Can be one of:
  • QUEUED
  • STARTING
  • RUNNING
  • COMPLETED
  • FAILED
  • SKIPPED
  • STOPPED
  • IDLE
  • EXCLUDED
metrics Metrics about the flow. For details, see FlowMetrics.
data_quality Data quality metrics about the flow and associated expectations. For details, see DataQualityMetrics.
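
For example, a sketch along these lines tracks flow status over time. It uses the same placeholder event log table as the earlier examples and assumes the origin column can be read with dot notation (use the : operator instead if your event log stores origin as a JSON string):

```sql
-- Minimal sketch: status and output row counts reported by each flow.
SELECT
  origin.flow_name,                                             -- assumes origin is a struct
  timestamp,
  details:flow_progress.status                  AS status,
  details:flow_progress.metrics.num_output_rows AS num_output_rows
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'flow_progress'
ORDER BY origin.flow_name, timestamp DESC;
```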

Details for update_progress event

The details for an update_progress event.

Field Description
state The new state of the update. Can be one of:
  • QUEUED
  • CREATED
  • WAITING_FOR_RESOURCES
  • INITIALIZING
  • RESETTING
  • SETTING_UP_TABLES
  • RUNNING
  • STOPPING
  • COMPLETED
  • FAILED
  • CANCELED

Useful for calculating the duration of various stages of a pipeline update, for example, the total duration or the time spent waiting for resources (see the sketch after this table).
cancellation_cause The reason why an update entered the CANCELED state. Includes reasons such as USER_ACTION or WORKFLOW_CANCELLATION (the workflow that triggered the update was canceled).
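
For example, the following sketch estimates the time each update spent waiting for resources, assuming INITIALIZING is the state entered after WAITING_FOR_RESOURCES. It uses the same placeholder table and the same origin caveat as the earlier examples:

```sql
-- Minimal sketch: seconds spent in WAITING_FOR_RESOURCES per update.
SELECT
  origin.update_id,                                             -- assumes origin is a struct
  TIMESTAMPDIFF(
    SECOND,
    MIN(CASE WHEN details:update_progress.state = 'WAITING_FOR_RESOURCES' THEN timestamp END),
    MIN(CASE WHEN details:update_progress.state = 'INITIALIZING' THEN timestamp END)
  ) AS seconds_waiting_for_resources
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'update_progress'
GROUP BY origin.update_id;
```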

Details for flow_definition event

The details for a flow_definition event.

Field Description
input_datasets The inputs read by this flow.
output_dataset The output dataset this flow writes to.
output_sink The output sink this flow writes to.
explain_text The explained query plan.
schema_json Spark SQL JSON schema string.
schema Schema of this flow.
flow_type The type of flow. Can be one of:
  • COMPLETE: Streaming table writes to its destination in complete (streaming) mode.
  • CHANGE: Streaming table using APPLY CHANGES INTO.
  • SNAPSHOT_CHANGE: Streaming table using APPLY CHANGES INTO ... FROM SNAPSHOT ....
  • APPEND: Streaming table writes to its destination in append (streaming) mode.
  • MATERIALIZED_VIEW: Outputs to a materialized view.
  • VIEW: Outputs to a view.
comment User comment or description about the dataset.
spark_conf Spark confs set on this flow.
language The language used to create this flow. Can be SCALA, PYTHON, or SQL.
once Whether this flow was declared to run once.
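
For example, the following sketch builds simple lineage edges (inputs and output per flow), using the same placeholder table and origin caveat as the earlier examples:

```sql
-- Minimal sketch: one row per flow definition with its inputs and output.
SELECT
  origin.flow_name,                                    -- assumes origin is a struct
  details:flow_definition.input_datasets AS input_datasets,
  details:flow_definition.output_dataset AS output_dataset
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'flow_definition';
```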

Details for dataset_definition event

The details for a dataset_definition event. Includes the following fields:

Field Description
dataset_type Differentiates between materialized views and streaming tables.
num_flows The number of flows writing to the dataset.
expectations The expectations associated with the dataset.

Details for sink_definition event

The details for a sink_definition event.

Field Description
format The format of the sink.
options The key-value options associated with the sink.

Details enum for deprecation event

The deprecation event has a message field. The possible values for the message include the following. This is a partial list that grows over time.

Value Description
TABLE_MANAGED_BY_MULTIPLE_PIPELINES A table is managed by multiple pipelines.
INVALID_CLUSTER_LABELS Using cluster labels that are not supported.
PINNED_DBR_VERSION Using dbr_version instead of channel in pipeline settings.
PREVIOUS_CHANNEL_USED Using the release channel PREVIOUS, which might go away in a future release.
LONG_DATASET_NAME Using a dataset name longer than the supported length.
LONG_SINK_NAME Using a sink name longer than the supported length.
LONG_FLOW_NAME Using a flow name longer than the supported length.
ENHANCED_AUTOSCALING_POLICY_COMPLIANCE Cluster policy only complies when Enhanced Autoscaling uses fixed cluster size.
DATA_SAMPLE_CONFIGURATION_KEY Using the configuration key to configure data sampling is deprecated.
INCOMPATIBLE_CLUSTER_SETTINGS Current cluster settings or cluster policy are no longer compatible with Lakeflow Declarative Pipelines.
STREAMING_READER_OPTIONS_DROPPED Using streaming reader options that are dropped.
DISALLOWED_SERVERLESS_STATIC_SPARK_CONFIG Setting static Spark configs through pipeline configuration for serverless pipelines is not allowed.
INVALID_SERVERLESS_PIPELINE_CONFIG The pipeline configuration provided for a serverless pipeline is invalid.
UNUSED_EXPLICIT_PATH_ON_UC_MANAGED_TABLE Specifying unused explicit table paths on UC managed tables.
FOREACH_BATCH_FUNCTION_NOT_SERIALIZABLE The provided foreachBatch function is not serializable.
DROP_PARTITION_COLS_NO_PARTITIONING Dropping the partition_cols attribute results in no partitioning.
PYTHON_CREATE_TABLE Using @dlt.create_table instead of @dlt.table.
PYTHON_CREATE_VIEW Using @dlt.create_view instead of @dlt.view.
PYTHON_CREATE_STREAMING_LIVE_TABLE Using create_streaming_live_table instead of create_streaming_table.
PYTHON_CREATE_TARGET_TABLE Using create_target_table instead of create_streaming_table.
FOREIGN_KEY_TABLE_CONSTRAINT_CYCLE The set of tables managed by the pipeline has a cycle in its foreign key constraints.
PARTIALLY_QUALIFIED_TABLE_REFERENCE_INCOMPATIBLE_WITH_DEFAULT_PUBLISHING_MODE A partially qualified table reference that has different meanings in default publishing mode and legacy publishing mode.

Details for cluster_resources event

The details for a cluster_resources event. Only applicable for pipelines running on classic compute.

Field Description
task_slot_metrics The task slot metrics of the cluster. For details, see TaskSlotMetrics object.
autoscale_info The state of autoscalers. For details, see AutoscaleInfo object.
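
For example, the following sketch (same placeholder table as earlier examples) charts task slot utilization over time for a classic compute pipeline:

```sql
-- Minimal sketch: task slot utilization reported by cluster_resources events.
SELECT
  timestamp,
  details:cluster_resources.task_slot_metrics.avg_task_slot_utilization AS avg_task_slot_utilization,
  details:cluster_resources.task_slot_metrics.num_task_slots            AS num_task_slots
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'cluster_resources'
ORDER BY timestamp;
```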

Details for autoscale event

The details for an autoscale event. Autoscale events are only applicable when the pipeline uses classic compute.

Field Description
status Status of this event. Can be one of:
  • SUCCEEDED
  • RESIZING
  • FAILED
  • PARTIALLY_SUCCEEDED
optimal_num_executors The optimal number of executors suggested by the algorithm before applying min_workers and max_workers bounds.
requested_num_executors The number of executors after truncating the optimal number of executors suggested by the algorithm to min_workers and max_workers bounds.
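
For example, the following sketch (same placeholder table as earlier examples) compares the requested executor count against the algorithm's optimum for each autoscale event:

```sql
-- Minimal sketch: autoscaling decisions over time.
SELECT
  timestamp,
  details:autoscale.status                  AS status,
  details:autoscale.optimal_num_executors   AS optimal_num_executors,
  details:autoscale.requested_num_executors AS requested_num_executors
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'autoscale'
ORDER BY timestamp DESC;
```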

Details for planning_information event

The details for a planning_information event. Useful for seeing details related to the chosen refresh type for a given flow during an update. Can be used to help debug why an update is fully refreshed rather than incrementally refreshed. For more details on incremental refreshes, see Incremental refresh for materialized views.

Field Description
technique_information Refresh-related information. It includes both information on what refresh methodology was chosen and the possible refresh methodologies that were considered. Useful for debugging why a materialized view failed to incrementalize. For more details, see TechniqueInformation.
source_table_information Source table information. Can be useful for debugging why a materialized view failed to incrementalize. For details, see TableInformation object.
target_table_information Target table information. For details, see TableInformation object.
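
For example, the following sketch (same placeholder table as earlier examples) pulls the most recent planning decisions, including the raw technique_information, which you can inspect to see why a full refresh was chosen:

```sql
-- Minimal sketch: recent planning decisions per target table.
SELECT
  timestamp,
  details:planning_information.target_table_information.table_name AS target_table,
  details:planning_information.technique_information               AS technique_information
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'planning_information'
ORDER BY timestamp DESC
LIMIT 10;
```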

Details for hook_progress event

The details of a hook_progress event. Includes the following fields:

Field Description
name The name of the user hook.
status The status of the user hook.

Details for operation_progress event

The details of an operation_progress event. Includes the following fields:

Field Description
type The type of operation being tracked. One of:
  • AUTO_LOADER_LISTING
  • AUTO_LOADER_BACKFILL
  • CONNECTOR_FETCH
  • CDC_SNAPSHOT
status The status of the operation. One of:
  • STARTED
  • COMPLETED
  • CANCELED
  • FAILED
  • IN_PROGRESS
duration_ms The total elapsed time of the operation in milliseconds. Only included in the end event (where status is COMPLETED, CANCELED, or FAILED).
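
For example, the following sketch (same placeholder table as earlier examples) lists finished operations and their durations:

```sql
-- Minimal sketch: completed, canceled, or failed operations and how long they took.
SELECT
  timestamp,
  details:operation_progress.type        AS operation_type,
  details:operation_progress.status      AS status,
  details:operation_progress.duration_ms AS duration_ms
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'operation_progress'
  AND details:operation_progress.status IN ('COMPLETED', 'CANCELED', 'FAILED');
```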

Other objects

The following objects represent additional data or enums within the event objects.

AutoscaleInfo object

The autoscale metrics for a cluster. Only applicable for pipelines running on classic compute.

Field Description
state The Autoscaling status. Can be one of:
  • SUCCEEDED
  • RESIZING
  • FAILED
  • PARTIALLY_SUCCEEDED
optimal_num_executors The optimal number of executors. This is the optimal size suggested by the algorithm before being truncated by the user-specified min/max number of executors.
latest_requested_num_executors The number of executors requested from the cluster manager by the state manager in the latest request. This is the number of executors the state manager is trying to scale to, and is updated when the state manager attempts to exit the scaling state in the event of timeouts. This field is not populated if there is no pending request.
request_pending_seconds The length of time the scaling request has been pending. This is not populated if there is no pending request.

CostModelRejectionSubType object

An enum of reasons that incrementalization is rejected, based on the cost of a full refresh versus an incremental refresh, in a planning_information event.

Value Description
NUM_JOINS_THRESHOLD_EXCEEDED Fully refresh because the query contains too many joins.
CHANGESET_SIZE_THRESHOLD_EXCEEDED Fully refresh because too many rows in the base tables changed.
TABLE_SIZE_THRESHOLD_EXCEEDED Fully refresh because the base table size exceeded the threshold.
EXCESSIVE_OPERATOR_NESTING Fully refresh because the query definition is complex and has many levels of operator nesting.
COST_MODEL_REJECTION_SUB_TYPE_UNSPECIFIED Fully refresh for any other reason.

DataQualityMetrics object

Metrics about how expectations are being met within the flow. Used in the details of a flow_progress event.

Field Description
dropped_records The number of records that were dropped because they failed one or more expectations.
expectations Metrics for expectations added to any dataset in the flow's query plan. When there are multiple expectations, this can be used to track which expectations were met or failed. For details, see ExpectationMetrics object.

ExpectationMetrics object

Metrics about a specific expectation.

Field Description
name The name of the expectation.
dataset The name of the dataset to which the expectation was added.
passed_records The number of records that pass the expectation.
failed_records The number of records that fail the expectation. Tracks whether the expectation was met, but does not describe what happens to the records (warn, fail, or drop the records).
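
For example, the following sketch (same placeholder table as earlier examples) explodes the expectations array out of the data quality metrics to total passed and failed records per expectation:

```sql
-- Minimal sketch: pass/fail totals per expectation across flow_progress events.
SELECT
  flow_events.expectation.name                AS expectation_name,
  flow_events.expectation.dataset             AS dataset,
  SUM(flow_events.expectation.passed_records) AS passed_records,
  SUM(flow_events.expectation.failed_records) AS failed_records
FROM (
  SELECT
    explode(
      from_json(
        details:flow_progress.data_quality.expectations,
        'array<struct<name: string, dataset: string, passed_records: bigint, failed_records: bigint>>'
      )
    ) AS expectation
  FROM my_catalog.my_schema.pipeline_event_log
  WHERE event_type = 'flow_progress'
    AND details:flow_progress.data_quality.expectations IS NOT NULL
) AS flow_events
GROUP BY flow_events.expectation.name, flow_events.expectation.dataset;
```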

FlowMetrics object

Metrics about the flow, including totals for the flow and metrics broken out by source. Used in the details of a flow_progress event.

Each streaming source supports only specific flow metrics. The supported streaming sources are Kafka, Kinesis, Delta, Auto Loader, and Google Pub/Sub; each source reports only the subset of backlog metrics (backlog bytes, backlog records, backlog seconds, and backlog files) that it supports.

Field Description
num_output_rows Number of output rows written by an update of this flow.
backlog_bytes Total backlog as bytes across all input sources in the flow.
backlog_records Total backlog records across all input sources in the flow.
backlog_files Total backlog files across all input sources in the flow.
backlog_seconds Maximum backlog seconds across all input sources in the flow.
executor_time_ms Sum of all task execution times in milliseconds of this flow over the reporting period.
executor_cpu_time_ms Sum of all task execution CPU times in milliseconds of this flow over the reporting period.
num_upserted_rows Number of output rows upserted into the dataset by an update of this flow.
num_deleted_rows Number of existing output rows deleted from the dataset by an update of this flow.
num_output_bytes Number of output bytes written by an update of this flow.
source_metrics Metrics for each input source in the flow. Useful for monitoring ingestion progress from sources outside Lakeflow Declarative Pipelines (like Apache Kafka, Pulsar, or Auto Loader). Includes the fields:
  • source_name: The name of the source.
  • backlog_bytes: Backlog as bytes for this source.
  • backlog_records: Backlog records for this source.
  • backlog_files: Backlog files for this source.
  • backlog_seconds: Backlog seconds for this source.
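
For example, the following sketch (same placeholder table as earlier examples) pulls the backlog metrics reported in flow_progress events:

```sql
-- Minimal sketch: backlog reported by each flow_progress event.
SELECT
  timestamp,
  details:flow_progress.metrics.backlog_bytes   AS backlog_bytes,
  details:flow_progress.metrics.backlog_records AS backlog_records,
  details:flow_progress.metrics.backlog_seconds AS backlog_seconds,
  details:flow_progress.metrics.backlog_files   AS backlog_files
FROM my_catalog.my_schema.pipeline_event_log
WHERE event_type = 'flow_progress'
  AND details:flow_progress.metrics IS NOT NULL
ORDER BY timestamp DESC;
```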

IncrementalizationIssue object

Represents issues with incrementalization that could cause a full refresh when planning an update.

Field Description
issue_type An issue type that could prevent the materialized view from incrementalizing. For details, see IssueType object.
prevent_incrementalization Whether this issue prevented the incrementalization from happening.
table_information Table information associated with issues like CDF_UNAVAILABLE, INPUT_NOT_IN_DELTA, DATA_FILE_MISSING.
operator_name Plan-related information. When the issue type is PLAN_NOT_DETERMINISTIC or PLAN_NOT_INCREMENTALIZABLE, this is set to the operator or expression that causes the non-determinism or non-incrementalizability.
expression_name The expression name.
join_type Auxiliary information when the operator is a join. For example, JOIN_TYPE_LEFT_OUTER or JOIN_TYPE_INNER.
plan_not_incrementalizable_sub_type Detailed category when the issue type is PLAN_NOT_INCREMENTALIZABLE. For details, see PlanNotIncrementalizableSubType object.
plan_not_deterministic_sub_type Detailed category when the issue type is PLAN_NOT_DETERMINISTIC. For details, see PlanNotDeterministicSubType object.
fingerprint_diff_before The diff from the fingerprint before.
fingerprint_diff_current The diff from the current fingerprint.
cost_model_rejection_subtype Detailed category when the issue type is INCREMENTAL_PLAN_REJECTED_BY_COST_MODEL. For details, see CostModelRejectionSubType object.

IssueType object

An enum of issue types that could cause a full refresh.

Value Description
CDF_UNAVAILABLE CDF (Change Data Feed) is not enabled on some base tables. The table_information field identifies which tables do not have CDF enabled. Use ALTER TABLE <table-name> SET TBLPROPERTIES ('delta.enableChangeDataFeed' = true) to enable CDF for the base table. If the source table is a materialized view, CDF is enabled by default.
DELTA_PROTOCOL_CHANGED Fully refresh because some base tables (details in the table_information field) had a Delta protocol change.
DATA_SCHEMA_CHANGED Fully refresh because some base tables (details in the table_information field) had a data schema change in the columns used by the materialized view definition. Not relevant if a column that the materialized view does not use has been changed or added to the base table.
PARTITION_SCHEMA_CHANGED Fully refresh because some base tables (details in the table_information field) had a partition schema change.
INPUT_NOT_IN_DELTA Fully refresh because the materialized view definition involves some non-Delta input.
DATA_FILE_MISSING Fully refresh because some base table files are already vacuumed due to their retention period.
PLAN_NOT_DETERMINISTIC Fully refresh because some operators or expressions in the materialized view definition are not deterministic. The operator_name and expression_name fields give information on which operator or expression caused the issue.
PLAN_NOT_INCREMENTALIZABLE Fully refresh because some operators or expressions in the materialized view definition are not incrementalizable.
SERIALIZATION_VERSION_CHANGED Fully refresh because there was a significant change in the query fingerprinting logic.
QUERY_FINGERPRINT_CHANGED Fully refresh because the materialized view definition changed, or Lakeflow Declarative Pipelines releases caused a change in the query evaluation plans.
CONFIGURATION_CHANGED Fully refresh because key configurations (for example, spark.sql.ansi.enabled) that might affect query evaluation have changed. Full recompute is required to avoid inconsistent states in the materialized view.
CHANGE_SET_MISSING Fully refresh because it is the first compute of the materialized view. This is expected behavior for initial materialized view computation.
EXPECTATIONS_NOT_SUPPORTED Fully refresh because the materialized view definition includes expectations, which are not supported for incremental updates. Remove expectations or handle them outside of the materialized view definition if incremental support is needed.
TOO_MANY_FILE_ACTIONS Fully refresh because the number of file actions exceeded the threshold for incremental processing. Consider reducing file churn in base tables or increasing thresholds.
INCREMENTAL_PLAN_REJECTED_BY_COST_MODEL Fully refresh because the cost model determined that a full refresh is more efficient than incremental maintenance. Review the cost model behavior or complexity of the query plan to allow incremental updates.
ROW_TRACKING_NOT_ENABLED Fully refresh because row tracking is not enabled on one or more base tables. Enable row tracking using ALTER TABLE <table-name> SET TBLPROPERTIES ('delta.enableRowTracking' = true).
TOO_MANY_PARTITIONS_CHANGED Fully refresh because too many partitions changed in the base tables. Try to limit the number of partition changes to stay within incremental processing limits.
MAP_TYPE_NOT_SUPPORTED Fully refresh because the materialized view definition includes a map type, which is not supported for incremental updates. Consider restructuring the data to avoid map types in the materialized view.
TIME_ZONE_CHANGED Fully refresh because the session or system time zone setting changed.
DATA_HAS_CHANGED Fully refresh because the data relevant to the materialized view changed in a way that prevents incremental updates. Evaluate the data changes and structure of the view definition to ensure compatibility with incremental logic.
PRIOR_TIMESTAMP_MISSING Fully refresh because the timestamp of the last successful run is missing. This can occur after metadata loss or manual intervention.

MaintenanceType object

An enum of maintenance types that might be chosen during a planning_information event. If the type is not MAINTENANCE_TYPE_COMPLETE_RECOMPUTE or MAINTENANCE_TYPE_NO_OP, the type is an incremental refresh.

Value Description
MAINTENANCE_TYPE_COMPLETE_RECOMPUTE Full recompute; always shown.
MAINTENANCE_TYPE_NO_OP When base tables do not change.
MAINTENANCE_TYPE_PARTITION_OVERWRITE Incrementally refresh affected partitions when the materialized view is co-partitioned with one of the source tables.
MAINTENANCE_TYPE_ROW_BASED Incrementally refresh by creating modular changesets for various operations, such as JOIN, FILTER, and UNION ALL, and composing them to calculate complex queries. Used when Row tracking for the source tables is enabled, and there is a limited number of joins for the query.
MAINTENANCE_TYPE_APPEND_ONLY Incrementally refresh by only computing new rows because there were no upserts or deletes in the source tables.
MAINTENANCE_TYPE_GROUP_AGGREGATE Incrementally refresh by calculating changes for each aggregate value. Used when associative aggregates, such as count, sum, mean, and stddev, are at the topmost level of the query.
MAINTENANCE_TYPE_GENERIC_AGGREGATE Incrementally refresh by calculating only the affected aggregate groups. Used when aggregates like median (not just associative ones) are at the topmost level of the query.
MAINTENANCE_TYPE_WINDOW_FUNCTION Incrementally refresh queries with window functions like PARTITION BY by recomputing only the changed partitions. Used when all of the window functions have a PARTITION BY or JOIN clause and are at the topmost level of the query.

Origin object

Where the event originated.

Field Description
cloud The cloud provider. The possible values are:
  • AWS
  • Azure
  • GCP
region The cloud region.
org_id The org id or workspace ID of the user. Unique within a cloud. Useful to identify the workspace, or to join with other tables, such as system billing tables.
pipeline_id The id of the pipeline. A unique identifier for the pipeline. Useful to identify the pipeline, or to join with other tables, such as system billing tables.
pipeline_type The type of the pipeline, which indicates where the pipeline was created. The possible values are:
  • DBSQL: A pipeline created via Databricks SQL.
  • WORKSPACE: An ETL pipeline created via Lakeflow Declarative Pipelines.
  • MANAGED_INGESTION: A Lakeflow Connect managed ingestion pipeline.
  • BRICKSTORE: A pipeline to update an online table for real-time feature serving.
  • BRICKINDEX: A pipeline to update a vector database. For more details, see vector search.
pipeline_name The name of the pipeline.
cluster_id The id of the cluster where an execution happens. Globally unique.
update_id The id of a single execution of the pipeline. This is equivalent to run ID.
table_name The name of the (Delta) table being written to.
dataset_name The fully qualified name of a dataset.
sink_name The name of a sink.
flow_id The id of the flow. It tracks the state of the flow being used across multiple updates. As long as the flow_id stays the same, the flow is refreshing incrementally. The flow_id changes when the materialized view is fully refreshed, the checkpoint is reset, or a full recomputation occurs within the materialized view.
flow_name The name of the flow.
batch_id The id of a microbatch. Unique within a flow.
request_id The id of the request that caused an update.
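
For example, the following sketch groups events on origin fields to count events per pipeline and update, with the same placeholder table and origin caveat as the earlier examples:

```sql
-- Minimal sketch: event counts per pipeline, update, and level.
SELECT
  origin.pipeline_name,    -- assumes origin is a struct
  origin.update_id,
  level,
  COUNT(*) AS num_events
FROM my_catalog.my_schema.pipeline_event_log
GROUP BY origin.pipeline_name, origin.update_id, level
ORDER BY origin.pipeline_name, origin.update_id;
```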

PlanNotDeterministicSubType object

An enum of non-deterministic cases for a planning_information event.

Value Description
STREAMING_SOURCE Fully refresh because the materialized view definition includes a streaming source, which is not supported.
USER_DEFINED_FUNCTION Fully refresh because the materialized view includes an unsupported user-defined function. Only deterministic Python UDFs are supported. Other UDFs might prevent incremental updates.
TIME_FUNCTION Fully refresh because the materialized view includes a time-based function such as CURRENT_DATE or CURRENT_TIMESTAMP. The expression_name property provides the name of the unsupported function.
NON_DETERMINISTIC_EXPRESSION Fully refresh because the query includes a non-deterministic expression such as RANDOM(). The expression_name property indicates the non-deterministic function that prevents incremental maintenance.

PlanNotIncrementalizableSubType object

An enum of reasons an update plan might not be incrementalizable.

Value Description
OPERATOR_NOT_SUPPORTED Fully refresh because the query plan includes an unsupported operator. The operator_name property provides the name of the unsupported operator.
AGGREGATE_NOT_TOP_NODE Fully refresh because an aggregate (GROUP BY) operator is not at the top level of the query plan. Incremental maintenance supports aggregates only at the top level. Consider defining two materialized views to separate the aggregation.
AGGREGATE_WITH_DISTINCT Fully refresh because the aggregation includes a DISTINCT clause, which is not supported for incremental updates.
AGGREGATE_WITH_UNSUPPORTED_EXPRESSION Fully refresh because the aggregation includes unsupported expressions. The expression_name property indicates the problematic expression.
SUBQUERY_EXPRESSION Fully refresh because the materialized view definition includes a subquery expression, which is not supported.
WINDOW_FUNCTION_NOT_TOP_LEVEL Fully refresh because a window function is not at the top level of the query plan.
WINDOW_FUNCTION_WITHOUT_PARTITION_BY Fully refresh because a window function is defined without a PARTITION BY clause.

TableInformation object

Represents details of a table considered during a planning_information event.

Field Description
table_name Table name used in the query from Unity Catalog or Hive metastore. Might not be available in case of path-based access.
table_id Required. Table ID from the Delta log.
catalog_table_type Type of the table as specified in the catalog.
partition_columns Partition columns of the table.
table_change_type Change type in the table. One of: TABLE_CHANGE_TYPE_UNKNOWN, TABLE_CHANGE_TYPE_APPEND_ONLY, TABLE_CHANGE_TYPE_GENERAL_CHANGE.
full_size The full size of the table in number of bytes.
change_size Size of the changed rows in changed files. It is calculated using change_file_read_size * num_changed_rows / num_rows_in_changed_files.
num_changed_partitions Number of changed partitions.
is_size_after_pruning Whether full_size and change_size represent data after static file pruning.
is_row_id_enabled Whether row ID is enabled on the table.
is_cdf_enabled Whether CDF is enabled on the table.
is_deletion_vector_enabled Whether deletion vector is enabled on the table.
is_change_from_legacy_cdf Whether the table change is from legacy CDF or row-ID-based CDF.

TaskSlotMetrics object

The task slot metrics for a cluster. Only applies to pipeline updates running on classic compute.

Field Description
summary_duration_ms The duration in milliseconds over which aggregate metrics (for example, avg_num_task_slots) are calculated.
num_task_slots The number of Spark task slots at the reporting instant.
avg_num_task_slots The average number of Spark task slots over the summary duration.
avg_task_slot_utilization The average task slot utilization (number of active tasks divided by number of task slots) over the summary duration.
num_executors The number of Spark executors at the reporting instant.
avg_num_queued_tasks The average task queue size (number of total tasks minus number of active tasks) over the summary duration.

TechniqueInformation object

Refresh methodology information for a planning event.

Field Description
maintenance_type The maintenance type related to this piece of information. If the type is not MAINTENANCE_TYPE_COMPLETE_RECOMPUTE or MAINTENANCE_TYPE_NO_OP, the flow was incrementally refreshed. For details, see MaintenanceType object.
is_chosen True for the technique that was chosen for the refresh.
is_applicable Whether the maintenance type is applicable.
incrementalization_issues Incrementalization issues that might cause an update to fully refresh. For details, see IncrementalizationIssue object.
change_set_information Information about the final produced change set. Values are one of:
  • CHANGE_SET_TYPE_APPEND_ONLY
  • CHANGE_SET_TYPE_GENERAL_ROW_CHANGE