Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Important
This feature is in Public Preview.
This page describes each component of the YAML used to define a metric view.
YAML overview
The YAML definition for a metric view includes six top-level fields:
version
: Defaults to0.1
. This is the version of the metric view specification.source
: The source data for the metric view. This can be a table-like asset or a SQL query.joins
: Optional. Star schema and snowflake schema joins are supported.filter
: Optional. A SQL boolean expression that applies to all queries; equivalent to theWHERE
clause.dimensions
: An array of dimension definitions, including the dimension name and expression.measures
: An array of aggregate expression columns.
YAML syntax and formatting
Metric view definitions follow standard YAML notation syntax. See yaml.org's documentation to learn more about YAML specifications.
Column name references
When referencing column names that contain spaces or special characters in YAML expressions, enclose the column name in backticks to escape the space or character. If the expression starts with a backtick and is used directly as a YAML value, wrap the entire expression in double quotes. Valid YAML values cannot start with a backtick.
Formatting examples
Use the following examples to learn how to format YAML correctly in common scenarios.
Reference a column name
The following table shows how to format column names depending on the characters they contain.
Case | Source column name(s) | Reference expression(s) | Notes |
---|---|---|---|
No spaces | revenue |
expr: "revenue" expr: 'revenue' expr: revenue |
Use double quotes, single quotes, or no quotes around the column name. |
With spaces | First Name |
expr: "`First Name`" |
Use backticks to escape spaces. Enclose the entire expression in double quotes. |
Column name with spaces in a SQL expression | First Name and Last Name |
expr: CONCAT(`First Name`, , `Last Name`) |
If the expression doesn't start with backticks, double quotes are not necessary. |
Quotes are included in the source column name | "name" |
expr: '`"name"`' |
Use backticks to escape the double-quotes in the column name. Enclose that expression in single quotes in the YAML definition. |
Use expressions with colons
Case | Expression | Notes |
---|---|---|
Expressions with colons | expr: "CASE WHEN Customer Tier = 'Enterprise: Premium' THEN 1 ELSE 0 END" |
Wrap the entire expression in double quotes for correct interpretation |
Note
YAML interprets unquoted colons as key-value separators. Always use double quotes around expressions that include colons.
Multi-line indentation
Case | Expression | Notes |
---|---|---|
Multi-line indentation | expr: > CASE WHEN revenue > 100 THEN 'High' ELSE 'Low' `` |
Indent the expression under the first line |
Note
Use the >
block scalar after expr:
for multi-line expressions. All lines must be indented at least two spaces beyond the expr
key for correct parsing.
Source
You can use a table-like asset or a SQL query as the source for your metric view. To use a table-like asset, you must have at least SELECT
privileges on the asset.
Use a table as a source
To use a table as a source, include the fully-qualified table name, as in the following example.
source: samples.tpch.orders
Use a SQL query as a source
To use a SQL query, write the query text directly in the YAML.
source: SELECT * FROM samples.tpch.orders o
LEFT JOIN samples.tpch.customer c
ON o.o_custkey = c.c_custkey
Note
When using a SQL query as a source with a JOIN
clause, Databricks recommends setting primary and foreign key constraints on underlying tables and using the RELY
option for optimal performance at query time, if applicable. For more information about using primary and foreign key constraints, see Declare primary key and foreign key relationships and Query optimization using primary key constraints.
Use metric view as a source
You can also use an existing metric view as the source for a new metric view:
version: 0.1
source: views.examples.source_metric_view
dimensions:
# Dimension referencing dimension from source_metric_view
- name: Order date
expr: order_date_dim
measures:
# Measure referencing dimension from source_metric_view
- name: Latest order month
expr: MAX(order_date_dim_month)
# Measure referencing measure from source_metric_view
- name: Latest order year
expr: DATE_TRUNC('year', MEASURE(max_order_date_measure))
When using a metric view as a source:
Dimensions in the new metric view can reference any dimension in the source metric view.
Measures in the new metric view can reference any dimension or measure in the source metric view.
All other composability rules apply. See Composability.
Filter
A filter in the YAML definition of a metric view applies to all queries that reference it. It must be written as a SQL boolean expression and is equivalent to using a WHERE
clause in a SQL query.
Joins
Joins define relationships between a metric view's source and other sources, such as tables, views, or other metric views. You can use joins to model relationships from the fact table to dimension tables (star schema) and to traverse from dimensions to sub-dimensions, allowing multi-hop joins across normalized dimension tables (snowflake schema). If you join to another metric view, only its dimensions are available in the downstream metric view.
Snowflake joins are supported only when using Databricks Runtime compute 17.1 and above. See Use joins in metric views.
Dimensions
Dimensions are used in SELECT
, WHERE
, and GROUP BY
clauses at query time. Each expression must return a scalar value. Each dimension consists of two components:
name
: The alias of the column.expr
: A SQL expression on the source data that defines the dimension.
The following example demonstrates how to define dimensions:
dimensions:
# Column name
- name: Order date
expr: o_orderdate
# SQL expression
- name: Order month
expr: DATE_TRUNC('MONTH', `Order date`)
# Referring to a column with a space in the name
- name: Month of order
expr: `Order month`
# Multi-line expression
- name: Order status
expr: CASE
WHEN o_orderstatus = 'O' THEN 'Open'
WHEN o_orderstatus = 'P' THEN 'Processing'
WHEN o_orderstatus = 'F' THEN 'Fulfilled'
END
Measures
Measures are an array of aggregate expressions that define aggregated results without a pre-determined level of aggregation. They must be expressed as aggregate functions. To reference a measure in a query, you must use the MEASURE
function. Each measure consists of the following components:
name
: The alias of the measure.expr
: An aggregate SQL expression that can include SQL aggregate functions.
See Aggregate functions for a list of aggregate functions.
See measure
aggregate function.
The following example demonstrates how to define measures:
measures:
# Basic aggregation
- name: Total revenue
expr: SUM(o_totalprice)
# Basic aggregation with ratio
- name: Total revenue per customer
expr: SUM(`Total revenue`) / COUNT(DISTINCT o_custkey)
# Measure-level filter
- name: Total revenue for open orders
expr: COUNT(o_totalprice) FILTER (WHERE o_orderstatus='O')
# Measure-level filter with multiple aggregate functions
# filter needs to be specified for each aggregate function in the expression
- name: Total revenue per customer for open orders
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus='O')/COUNT(DISTINCT o_custkey) FILTER (WHERE o_orderstatus='O')
Window measures
Important
This feature is Experimental.
Window measures enable you to define measures with windowed, cumulative, or semiadditive aggregations in your metric views. These types of measures allow for more complex calculations, such as moving averages, period-over-period changes, and running totals. See Use window measures in metric views for examples that demonstrate how to use window measures in metric views.
Composability
Metric views are composable, allowing you to build complex logic by referencing previously defined elements.
In a metric view definition:
- Dimensions can reference dimensions previously defined in the YAML.
- Measures can reference all dimensions.
- Measures can reference measures previously defined using the
MEASURE()
function.
The following example shows how dimensions and measures can be composed:
dimensions:
# Dimension referencing a source column
- name: Order month
expr: DATE_TRUNC('month', o_orderdate)
# Dimension referencing a previously defined dimension
- name: Previous order month
expr: ADD_MONTHS(`Order Month`, -1)
measures:
# Measure referencing a dimension
- name: Earliest order month
expr: MIN(`Order month`)
# Measure referencing a source column
- name: Revenue
expr: SUM(sales_amount)
# Measure referencing a source column
- name: Costs
expr: SUM(item_cost)
# Measure referencing previously defined measures
- name: Profit
expr: MEASURE(Revenue) - MEASURE(Costs)
Column name mapping in CREATE VIEW
with YAML
When you create a metric view using CREATE VIEW
with a column_list
, the system maps YAML-defined columns (measures and dimensions) to the column_list
by position, not by name.
This follows standard SQL behavior as shown in the following example:
CREATE VIEW v (col1, col2) AS SELECT a, b FROM table;
In this example, a
maps to col1
, and b
maps to col2
, regardless of their original names.