Metric view YAML reference

Important

This feature is in Public Preview.

This page describes each component of the YAML used to define a metric view.

YAML overview

The YAML definition for a metric view includes six top-level fields:

  • version: Defaults to 0.1. This is the version of the metric view specification.
  • source: The source data for the metric view. This can be a table-like asset or a SQL query.
  • joins: Optional. Star schema and snowflake schema joins are supported.
  • filter: Optional. A SQL boolean expression that applies to all queries; equivalent to the WHERE clause.
  • dimensions: An array of dimension definitions, including the dimension name and expression.
  • measures: An array of aggregate expression columns.
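
To show how these fields fit together, the following is a minimal sketch of a complete definition. It assumes the samples.tpch.orders table used elsewhere on this page; the filter condition and the dimension and measure names are illustrative, and joins is omitted because it is optional.

```yaml
version: 0.1

# Table-like asset used as the source
source: samples.tpch.orders

# Optional: applied to every query, like a WHERE clause
filter: o_orderdate > '1990-01-01'

dimensions:
  - name: Order month
    expr: DATE_TRUNC('MONTH', o_orderdate)

measures:
  - name: Total revenue
    expr: SUM(o_totalprice)
```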

YAML syntax and formatting

Metric view definitions follow standard YAML syntax. See the documentation at yaml.org to learn more about the YAML specification.

Column name references

When referencing column names that contain spaces or special characters in YAML expressions, enclose the column name in backticks to escape the space or character. If the expression starts with a backtick and is used directly as a YAML value, wrap the entire expression in double quotes. Valid YAML values cannot start with a backtick.

Formatting examples

Use the following examples to learn how to format YAML correctly in common scenarios.

Reference a column name

The following table shows how to format column names depending on the characters they contain.

| Case | Source column name(s) | Reference expression(s) | Notes |
|---|---|---|---|
| No spaces | revenue | expr: "revenue" / expr: 'revenue' / expr: revenue | Use double quotes, single quotes, or no quotes around the column name. |
| With spaces | First Name | expr: "`First Name`" | Use backticks to escape spaces. Enclose the entire expression in double quotes. |
| Column name with spaces in a SQL expression | First Name and Last Name | expr: CONCAT(`First Name`, ' ', `Last Name`) | If the expression doesn't start with a backtick, double quotes are not necessary. |
| Quotes are included in the source column name | "name" | expr: '`"name"`' | Use backticks to escape the double quotes in the column name. Enclose that expression in single quotes in the YAML definition. |

Use expressions with colons

| Case | Expression | Notes |
|---|---|---|
| Expressions with colons | expr: "CASE WHEN `Customer Tier` = 'Enterprise: Premium' THEN 1 ELSE 0 END" | Wrap the entire expression in double quotes so that the colon is read as part of the expression. |

Note

YAML interprets unquoted colons as key-value separators. Always use double quotes around expressions that include colons.
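
In context, a quoted expression containing a colon might look like the following sketch. The dimension name is hypothetical, and the Customer Tier column is assumed to exist in the source; it is escaped with backticks because its name contains a space.

```yaml
dimensions:
  # The colon inside 'Enterprise: Premium' is safe because the whole
  # expression is wrapped in double quotes.
  - name: Is premium enterprise
    expr: "CASE WHEN `Customer Tier` = 'Enterprise: Premium' THEN 1 ELSE 0 END"
```

Without the surrounding double quotes, the colon followed by a space would end the plain YAML value early and the definition would not parse as intended.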

Multi-line indentation

Case: Multi-line indentation

Expression:

expr: >
  CASE WHEN
  revenue > 100 THEN 'High'
  ELSE 'Low'
  END

Notes: Indent the expression under the first line.

Note

Use the > block scalar after expr: for multi-line expressions. All lines must be indented at least two spaces beyond the expr key for correct parsing.
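
The following sketch shows a folded block scalar inside a full dimension entry, so the indentation relative to the expr key is visible. The dimension name and the revenue column are illustrative assumptions.

```yaml
dimensions:
  - name: Revenue band
    # Every continuation line is indented two spaces beyond the expr key
    expr: >
      CASE WHEN
      revenue > 100 THEN 'High'
      ELSE 'Low'
      END
```

With the > style, YAML folds the line breaks between these equally indented lines into single spaces, so the expression reaches the SQL parser as one line.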

Source

You can use a table-like asset or a SQL query as the source for your metric view. To use a table-like asset, you must have at least SELECT privileges on the asset.

Use a table as a source

To use a table as a source, include the fully-qualified table name, as in the following example.

source: samples.tpch.orders

Use a SQL query as a source

To use a SQL query, write the query text directly in the YAML.

source: SELECT * FROM samples.tpch.orders o
  LEFT JOIN samples.tpch.customer c
  ON o.o_custkey = c.c_custkey

Note

When using a SQL query as a source with a JOIN clause, Databricks recommends setting primary and foreign key constraints on underlying tables and using the RELY option for optimal performance at query time, if applicable. For more information about using primary and foreign key constraints, see Declare primary key and foreign key relationships and Query optimization using primary key constraints.

Use metric view as a source

You can also use an existing metric view as the source for a new metric view:

version: 0.1
source: views.examples.source_metric_view

dimensions:

  # Dimension referencing dimension from source_metric_view
  - name: Order date
    expr: order_date_dim

measures:

  # Measure referencing dimension from source_metric_view
  - name: Latest order month
    expr: MAX(order_date_dim_month)

  # Measure referencing measure from source_metric_view
  - name: Latest order year
    expr: DATE_TRUNC('year', MEASURE(max_order_date_measure))

When using a metric view as a source:

  • Dimensions in the new metric view can reference any dimension in the source metric view.

  • Measures in the new metric view can reference any dimension or measure in the source metric view.

All other composability rules apply. See Composability.

Filter

A filter in the YAML definition of a metric view applies to all queries that reference it. It must be written as a SQL boolean expression and is equivalent to using a WHERE clause in a SQL query.
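
As a minimal sketch, assuming the samples.tpch.orders table used elsewhere on this page, a filter that limits every query to fulfilled orders with a positive price could be written as follows. The condition and the dimension and measure names are illustrative.

```yaml
version: 0.1
source: samples.tpch.orders

# Applied to every query that references the metric view
filter: o_orderstatus = 'F' AND o_totalprice > 0

dimensions:
  - name: Order date
    expr: o_orderdate

measures:
  - name: Total revenue
    expr: SUM(o_totalprice)
```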

Joins

Joins define relationships between a metric view's source and other sources, such as tables, views, or other metric views. You can use joins to model relationships from the fact table to dimension tables (star schema) and to traverse from dimensions to sub-dimensions, allowing multi-hop joins across normalized dimension tables (snowflake schema). If you join to another metric view, only its dimensions are available in the downstream metric view.

Snowflake joins are supported only on compute running Databricks Runtime 17.1 and above. See Use joins in metric views.
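
A star schema join might be sketched as follows. This page does not list the keys of a join entry, so the name, source, and on keys, the source alias in the join condition, and the customer.c_name reference are assumptions based on the linked joins documentation; verify them there before use.

```yaml
version: 0.1
source: samples.tpch.orders

joins:
  # Assumed join entry shape: an alias, the joined asset, and a join condition
  - name: customer
    source: samples.tpch.customer
    on: source.o_custkey = customer.c_custkey

dimensions:
  # Assumed: joined columns are referenced through the join alias
  - name: Customer name
    expr: customer.c_name

measures:
  - name: Total revenue
    expr: SUM(o_totalprice)
```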

Dimensions

Dimensions are used in SELECT, WHERE, and GROUP BY clauses at query time. Each expression must return a scalar value. Each dimension consists of two components:

  • name: The alias of the column.

  • expr: A SQL expression on the source data that defines the dimension.

The following example demonstrates how to define dimensions:

dimensions:

  # Column name
  - name: Order date
    expr: o_orderdate

  # SQL expression
  - name: Order month
    expr: DATE_TRUNC('MONTH', `Order date`)

  # Referring to a column with a space in the name
  - name: Month of order
    expr: "`Order month`"

  # Multi-line expression
  - name: Order status
    expr: CASE
            WHEN o_orderstatus = 'O' THEN 'Open'
            WHEN o_orderstatus = 'P' THEN 'Processing'
            WHEN o_orderstatus = 'F' THEN 'Fulfilled'
          END

Measures

Measures are an array of aggregate expressions that define aggregated results without a pre-determined level of aggregation. They must be expressed as aggregate functions. To reference a measure in a query, you must use the MEASURE function. Each measure consists of the following components:

  • name: The alias of the measure.

  • expr: An aggregate SQL expression that can include SQL aggregate functions.

See Aggregate functions for a list of aggregate functions.

See measure aggregate function.

The following example demonstrates how to define measures:

measures:

  # Basic aggregation
  - name: Total revenue
    expr: SUM(o_totalprice)

  # Basic aggregation with ratio
  - name: Total revenue per customer
    expr: SUM(o_totalprice) / COUNT(DISTINCT o_custkey)

  # Measure-level filter
  - name: Total revenue for open orders
    expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus='O')

  # Measure-level filter with multiple aggregate functions
  # filter needs to be specified for each aggregate function in the expression
  - name: Total revenue per customer for open orders
    expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus='O')/COUNT(DISTINCT o_custkey) FILTER (WHERE o_orderstatus='O')

Window measures

Important

This feature is Experimental.

Window measures enable you to define measures with windowed, cumulative, or semiadditive aggregations in your metric views. These types of measures allow for more complex calculations, such as moving averages, period-over-period changes, and running totals. See Use window measures in metric views for examples that demonstrate how to use window measures in metric views.

Composability

Metric views are composable, allowing you to build complex logic by referencing previously defined elements.

In a metric view definition:

  • Dimensions can reference dimensions previously defined in the YAML.
  • Measures can reference all dimensions.
  • Measures can reference measures previously defined using the MEASURE() function.

The following example shows how dimensions and measures can be composed:

dimensions:

  # Dimension referencing a source column
  - name: Order month
    expr: DATE_TRUNC('month', o_orderdate)

  # Dimension referencing a previously defined dimension
  - name: Previous order month
    expr: ADD_MONTHS(`Order month`, -1)

measures:

  # Measure referencing a dimension
  - name: Earliest order month
    expr: MIN(`Order month`)

  # Measure referencing a source column
  - name: Revenue
    expr: SUM(sales_amount)

  # Measure referencing a source column
  - name: Costs
    expr: SUM(item_cost)

  # Measure referencing previously defined measures
  - name: Profit
    expr: MEASURE(Revenue) - MEASURE(Costs)

Column name mapping in CREATE VIEW with YAML

When you create a metric view using CREATE VIEW with a column_list, the system maps YAML-defined columns (measures and dimensions) to the column_list by position, not by name.

This follows standard SQL behavior as shown in the following example:

CREATE VIEW v (col1, col2) AS SELECT a, b FROM table;

In this example, a maps to col1, and b maps to col2, regardless of their original names.