MetricFlow to KTX Semantic Layer
A MetricFlow semantic_model maps to an SL source; MetricFlow measures map to KTX measures; MetricFlow entities map to KTX joins; MetricFlow metrics (top-level) map to KTX measures OR to cross-model derived measures. Files in one WorkUnit are ALWAYS part of the same logical entity (a connected component, possibly spanning extends: + cross-model metric refs). Flatten inheritance and cross-file references at write time.
Mapping table
| MetricFlow | KTX form | Notes |
|---|---|---|
| semantic_model: X { model: ref('t') } with measures + dimensions | Overlay at <connId>/X.yaml with measures, computed-only columns, column_overrides, joins | The model: ref resolves to a manifest table. |
| semantic_model: X { model: source('s','t') } | Overlay at <connId>/X.yaml over table t. | Same shape; source() still resolves to a physical table. |
| semantic_model: X { model: <literal> } with no manifest entry | Standalone with explicit sql:, grain:, columns: | Happens when the dbt manifest isn't available. |
| semantic_model: Y { extends: X } | Merge Y's measures/dimensions/entities into X's overlay, or write a single overlay named for the most-derived child (Y) containing both X's and Y's primitives | Do not emit a second overlay for X - flatten. |
| measures: [{ name, agg, expr }] | measures: [{ name, expr: "<agg>(<expr>)" }] | Aggregation inlined. agg: count_distinct → count(distinct ...). |
| entities: [{ name, type: primary }] | grain: [<entity_name-or-expr>] on the overlay/standalone | Primary/unique entities drive grain. |
| entities: [{ name, type: foreign }] | joins: entry joining to the primary-entity's semantic_model | Only when a matching primary is discoverable. |
| metrics: [{ type: simple, type_params: { measure: X } }] | If the base measure is labeled/described by the metric: in-place edit to the existing measure. Otherwise leave as-is. | Same-name metrics can absorb metadata. |
| metrics: [{ type: simple, filter: <jinja> }] | New measure on the same source, with the filter translated to SQL and attached via filter: | Translate Jinja {{ Dimension('x__y') }} to the column name y. |
| metrics: [{ type: derived, type_params: { expr, metrics } }] | Derived measure on whichever source owns the referenced measures, with expr: referencing measure names | If the metric spans models, still write it once on the source owning the "primary" measure (the one the agent judges most central). Mention the cross-model chain in the description. |
| metrics: [{ type: ratio, type_params: { numerator, denominator } }] | Same as derived; expr: "numerator / NULLIF(denominator, 0)" if no explicit expr | Safe-division by default. |
| metrics: [{ type: cumulative, type_params: { window, grain_to_date } }] | Standalone source with a window-function SQL; reference the resulting column as a normal measure | KTX SL has no first-class cumulative primitive (spec Non-goals). |
| metrics: [{ type: conversion }] | Flag for human - do NOT write. Emit a wiki note describing the intended semantics. | No KTX equivalent in v1. |
| Metric not mappable | Wiki page <metric_name>-definition.md with the full YAML body quoted | Capture the intent even if we can't emit SL. |
Type map: MetricFlow time to KTX time; categorical to string; number to number; boolean to boolean. When a field's expr and name differ, follow expr, because expr is the physical column.
Verify each MetricFlow model source table with entity_details before producing the corresponding sl_write_source.
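The measure, ratio, and filter rows of the mapping table can be sketched as small string transforms. This is a minimal illustration, not the actual converter: the function names (`inline_measure`, `ratio_expr`, `translate_filter`) are hypothetical, and falling back to the measure name when `expr` is absent is an assumption.

```python
import re

# Type map from this section: MetricFlow dimension type -> KTX type.
TYPE_MAP = {
    "time": "time",
    "categorical": "string",
    "number": "number",
    "boolean": "boolean",
}


def inline_measure(name, agg, expr=None):
    """Fold a MetricFlow agg into a KTX measure expr (hypothetical helper)."""
    # Assumption: a measure without expr aggregates the column named like it.
    column = expr or name
    if agg == "count_distinct":
        sql = f"count(distinct {column})"
    else:
        sql = f"{agg}({column})"
    return {"name": name, "expr": sql}


def ratio_expr(numerator, denominator):
    """Safe-division default used when a ratio metric has no explicit expr."""
    return f"{numerator} / NULLIF({denominator}, 0)"


# Matches {{ Dimension('x__y') }} with single or double quotes.
_DIMENSION = re.compile(r"\{\{\s*Dimension\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def translate_filter(jinja):
    """Replace each {{ Dimension('x__y') }} with the bare column name y."""
    return _DIMENSION.sub(lambda m: m.group(1).split("__")[-1], jinja)
```

For example, `translate_filter("{{ Dimension('order__status') }} = 'paid'")` yields `status = 'paid'`. Other Jinja constructs are left untouched by this sketch and would need separate handling.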
Identifier Verification Protocol
Before writing a wiki page or SL source on any topic:
- `discover_data({query: "<topic>"})` - see what wikis, SL sources, and raw tables already exist. Prefer updating existing pages over creating new ones.

Before emitting any schema.table or schema.table.column into a wiki body, SL source, tables: frontmatter, sl_refs, or emit_unmapped_fallback:
- `entity_details({connectionId, targets: [{display: "<identifier>"}]})` - confirm the identifier resolves; inspect native types, FK/PK, and sampleValues.
- For literal values from the source, such as status codes or plan tiers, check whether they appear in `entity_details` sampleValues for the relevant column. If sampleValues is short or the sample may have missed real values, run a `sql_execution` probe with the same warehouse connection id: `sql_execution({connectionId, sql: "SELECT DISTINCT <col> FROM <ref> LIMIT 50"})`.
- If the candidate identifier still does not resolve, do one of:
  - Use `sql_execution({connectionId, sql: "SELECT 1 FROM <ref> LIMIT 0"})`. If it errors, the identifier is fictional.
  - Wrap the identifier in `[unverified - from <rawPath>]` in the wiki body, citing the exact raw path that mentioned it.
  - When recording `emit_unmapped_fallback` with `no_physical_table`, include the failing probe error in `clarification`.
- Never copy `<schema>.<table>` placeholder strings from these instructions into output.
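One way to picture the fallback path for a single identifier is the sketch below. It is an illustration under stated assumptions, not the tool contract: `entity_details` and `sql_execution` are passed in as plain Python callables standing in for the real tool calls (the first returns a truthy value when the identifier resolves, the second raises on a warehouse error), and appending the tag after the identifier is only one reading of "wrap".

```python
def verify_identifier(ref, raw_path, entity_details, sql_execution):
    """Return (text_to_emit, probe_error) for one candidate identifier.

    entity_details and sql_execution are hypothetical stand-ins for the
    real tools, injected so the flow can be exercised without a warehouse.
    """
    # Step 1: the identifier resolves through entity_details, so emit it as-is.
    if entity_details(ref):
        return ref, None

    # Step 2: zero-row probe; an error means the identifier is fictional.
    try:
        sql_execution(f"SELECT 1 FROM {ref} LIMIT 0")
    except Exception as exc:  # broad on purpose: any warehouse error counts
        tagged = f"{ref} [unverified - from {raw_path}]"
        # The error text is what would go into the fallback's clarification.
        return tagged, str(exc)

    # The probe succeeded, so the table exists even though lookup missed it.
    return ref, None
```

Returning the probe error alongside the tagged text keeps the failing message available for the `clarification` field when an `emit_unmapped_fallback` is recorded.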
Flattening extends:
Within one WorkUnit, multiple semantic_models linked by extends: are guaranteed to be present (the chunker groups them). Resolve inheritance before writing:
- Start with the most-derived child (the one that no other semantic_model extends).
- Walk the `extends:` chain upward, accumulating measures, dimensions, entities.
- Write ONE overlay/standalone, named for the most-derived child's SL-appropriate name (not the base).
- Parents that lack their own distinctive content should NOT get a separate overlay. If a parent has unique measures a child doesn't inherit, consider whether the base is used elsewhere - if yes, write both; if no, still one overlay.
- Measure/dimension name collisions: child wins, but note the overridden parent in the overlay's description or in a sibling wiki page.
The spec's worked example has orders, orders_ext (extends orders), and metrics/orders_final.yml (defines revenue referencing both). The right output is ONE overlay named orders_ext (or orders if the team's naming favors the base) containing order_count, gross_amount, refund_amount, and a derived revenue measure. Provenance tags point to all three source files.
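The flattening steps above can be sketched as a walk up the `extends:` chain followed by a base-first merge, so that child definitions overwrite parent ones. This is a minimal sketch that handles measures only; the input shape (a dict of model name to `extends` and `measures`) and the function name `flatten_extends` are assumptions made for illustration.

```python
def flatten_extends(models):
    """Flatten one extends: chain into a single overlay description.

    models maps name -> {"extends": parent_name_or_None,
                         "measures": {measure_name: expr}}.
    This shape is hypothetical; real semantic_models carry more fields.
    """
    parents = {m["extends"] for m in models.values() if m.get("extends")}
    leaves = [name for name in models if name not in parents]
    if len(leaves) != 1:
        raise ValueError(f"expected one most-derived child, got {leaves}")

    # Walk upward from the most-derived child to the base.
    chain, seen, name = [], set(), leaves[0]
    while name is not None:
        if name in seen:
            raise ValueError(f"cycle in extends: chain at {name}")
        seen.add(name)
        chain.append(name)
        name = models[name].get("extends")

    # Merge base first so that the child wins on name collisions.
    merged, overridden = {}, []
    for name in reversed(chain):
        for measure, expr in models[name]["measures"].items():
            if measure in merged:
                overridden.append(measure)  # note this in the description
            merged[measure] = expr

    return {
        "name": chain[0],        # named for the most-derived child
        "measures": merged,
        "overridden": overridden,
        "provenance": chain,     # every model that contributed
    }
```

The `overridden` list is what would feed the note about overridden parent measures, and `provenance` lists the contributing models for the provenance tags.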
model: ref resolution
The model: field on a semantic_model is a string like ref('table_name'), source('src','table_name'), or a literal. Resolve:
- `ref('x')` → table name `x`. Verify via `sl_discover(x)`.
- `source('s','t')` → table name `t`. Verify via `sl_discover(t)`.
- Literal (no `ref(...)`/`source(...)`) → treat as the table name directly.
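The three resolution cases above amount to a small string parse. The sketch below assumes the `model:` value arrives as a plain string; `resolve_model` is a hypothetical helper name, and the result still needs to be verified against the warehouse before use.

```python
import re

# ref('x') with single or double quotes.
_REF = re.compile(r"^ref\(\s*['\"]([^'\"]+)['\"]\s*\)$")
# source('s', 't'): the second argument is the table name.
_SOURCE = re.compile(
    r"^source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)$"
)


def resolve_model(model):
    """Return the candidate table name for a semantic_model's model: field."""
    text = model.strip()
    match = _REF.match(text)
    if match:
        return match.group(1)
    match = _SOURCE.match(text)
    if match:
        return match.group(2)
    # Literal: no ref(...)/source(...) wrapper, use the string directly.
    return text
```

Two-argument `ref(...)` forms are not handled here and would fall through to the literal branch, so a caller would need to treat an unparsed `ref(` prefix as a signal to inspect the value by hand.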
If sl_discover errors because no such table exists, use discover_data and
entity_details to find the warehouse target. If a SQL probe is still needed,
call sql_execution with the same warehouse connection id, for example:
sql_execution({connectionId: "warehouse", sql: "SELECT 1 FROM analytics.orders LIMIT 0"}).
Never invent column names - every column in computed columns:, column_overrides:, grain:, and
sql: must be sourced from raw files, entity_details, or a successful SQL
probe.
After every sl_write_source, call sl_validate. The warehouse will reject invented columns with Unrecognized name: <name> - treat as a hard failure and re-read the schema.
Cumulative metrics - sql-standalone fallback
KTX SL has no first-class window: or grain_to_date: primitive in v1 (spec Non-goals). Translate a MetricFlow cumulative metric to a standalone SL source with a window-function SQL:
# MetricFlow input:
metrics:
  - name: cum_revenue_7d
    type: cumulative
    type_params:
      measure: gross_amount
      window: 7 days
# KTX standalone output:
name: cum_revenue_7d
source_type: sql
sql: |
SELECT
ordered_at,
SUM(amount) OVER (ORDER BY ordered_at RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CU