Metabase to KTX Semantic Layer
Each WorkUnit represents one Metabase collection's cards for one Metabase database (mapped to exactly one KTX connection). Every cards/<id>.json file carries the resolved SQL, result_metadata, card type, collection path, and referenced-card ids. The WU's sync-config.json tells you which sync mode is active and which selections apply. databases/<id>.json tells you the target KTX connection.
Context format
Each card JSON looks like:
{
"metabaseId": 7,
"name": "Daily orders",
"description": "Orders by day",
"type": "model",
"databaseId": 42,
"collectionId": 5,
"resolvedSql": "SELECT ...",
"templateTags": [{"name": "ref", "type": "card", "cardReference": 10}],
"resultMetadata": [
{"name": "day", "base_type": "type/DateTime", "semantic_type": "type/CreationTimestamp"},
{"name": "order_count", "base_type": "type/Integer"}
],
"collectionPath": ["Data", "Orders Team"],
"referencedCardIds": [10]
}
Use resultMetadata to:
- Map
base_typeto KSL column type:type/Integer,type/Float,type/Decimal,type/BigInteger→number;type/Text,type/TextLike→string;type/DateTime,type/Date,type/DateTimeWithTZ→time;type/Boolean→boolean. - Identify grain candidates: columns with
semantic_type: type/PK. - Identify join candidates: columns with
semantic_type: type/FKplusfk_target_field_id. - Identify time columns:
semantic_type: type/CreationTimestamportype/UpdatedTimestamp→ setrole: time. - Use
display_namefor measure descriptions when available.
Additional card metadata
parameters: list of card-level parameters with widget types and defaults. When SQL resolution fell back to unresolved SQL, use this to drive Step A of the SQL-translation workflow (drop optional clauses): knowing each{{ var }}istype: "date/range"vstype: "category"tells you what kind of clause it is.resultMetadata[i].field_ref: Metabase's canonical reference to the source warehouse field. Shape["field", <field_id>, <options>]. When this is set, the column maps directly to a warehouse field, which is useful for declaring joins from FK metadata without re-parsing SQL.lastRunAt: ISO timestamp of the card's last execution. If null or very old, the card may be dead; prefer skipping over creating a source.dashboardCount: number of dashboards referencing the card. Cards withdashboardCount: 0and a stalelastRunAtare strong skip signals.
Before writing a wiki page derived from a Metabase question SQL, verify each schema.table.column mentioned with entity_details.
Identifier Verification Protocol
Before writing a wiki page or SL source on any topic:
discover_data({query: "<topic>"})- see what wikis, SL sources, and raw tables already exist. Prefer updating existing pages over creating new ones.
Before emitting any schema.table or schema.table.column into a wiki body,
SL source, tables: frontmatter, sl_refs, or emit_unmapped_fallback:
entity_details({connectionId, targets: [{display: "<identifier>"}]})- confirm the identifier resolves; inspect native types, FK/PK, and sampleValues.- For literal values from the source, such as status codes or plan tiers,
check whether they appear in
entity_detailssampleValues for the relevant column. If sampleValues is short or the sample may have missed real values, run asql_executionprobe with the same warehouse connection id:sql_execution({connectionId, sql: "SELECT DISTINCT <col> FROM <ref> LIMIT 50"}). - If the candidate identifier still does not resolve, do one of:
- Use
sql_execution({connectionId, sql: "SELECT 1 FROM <ref> LIMIT 0"}). If it errors, the identifier is fictional. - Wrap the identifier in
[unverified - from <rawPath>]in the wiki body, citing the exact raw path that mentioned it. - When recording
emit_unmapped_fallbackwithno_physical_table, include the failing probe error inclarification.
- Use
- Never copy
<schema>.<table>placeholder strings from these instructions into output.
Decision tree
For each card:
- Analyze
resolvedSql+resultMetadata: identify base tables, aggregations, joins, filters, column types. - REQUIRED before any write: call
sl_discoverfor every candidate target source name. The response tells you whether the name is manifest-backed (Type: tableorType: sql). For manifest-backed names you MUST use the overlay shape (name:plus overlay fields such asmeasures:,segments:,descriptions:,joins:,disable_joins:,column_overrides:, and computed-onlycolumns:entries withexpr+type; nosql:,table:,grain:, or base-tablecolumns:); the tool will reject a standalone write and you'll have wasted the call. Ifsl_discoverreturns nothing for the name, you can write a standalone source. Also callsl_read_sourceon existing sources you intend to extend so you don't duplicate measures. - Include
rawPaths: ["cards/<id>.json"]on everysl_write_source,sl_edit_source, andwiki_writecall. If one artifact generalizes multiple near-duplicate cards, include each contributing card path and no unrelated cards. - Decide:
- Simple aggregation on a table that already has a source →
sl_edit_sourceto add a measure. - Join between tables that should be linked in the SL graph →
sl_edit_sourceto add a join. - Complex derived SQL (CTEs, multi-layer aggregation, scoring models) →
sl_write_sourcewithsource_type: sql. When the SQL projects/filters from a single manifest-backed base table, setinherits_columns_from: <manifest_key>so columns inherit type and description from the manifest - seesl_captureskill for the slim form. Usesl_discoverto discover the manifest key from the table reference in the SQL (it acceptsMARTS.CONSIGNMENTS,ANALYTICS.MARTS.CONSIGNMENTS, orCONSIGNMENTS). - New base table not yet in the semantic layer →
sl_write_sourcewithsource_type: table. - Trivial query (
SELECT *, simpleCOUNT(*)with no business logic) → do nothing; the runner will record this card asaction_type='skipped'. - Duplicate of an existing measure → same as trivial; do nothing for this card.
- Simple aggregation on a table that already has a source →
Manifest-only names need an overlay first. If sl_discover shows a source name with Type: table but sl_read_source returns "Source not found", the source lives only in the schema manifest (no standalone overlay yet). sl_edit_source cannot edit manifest-only names, and a full standalone sl_write_source for that name would shadow manifest columns and joins. Bootstrap an overlay with sl_write_source using the overlay shape:
name: <SOURCE_NAME>
measures:
- name: <measure_name>
expr: "<expression>"
Overlay shape: name: plus any of measures:, segments:, descriptions:, joins:, disable_joins:, exclude_columns:, column_overrides:, or computed-only columns: entries with expr + type. Never include sql:, table:, grain:, or base-table columns: on a manifest-backed name — those would shadow the manifest's schema and drop its joins. Use column_overrides: for inherited column descriptions. Overlay joins: are merged additively with the manifest's joins (deduped by to + on); use disable_joins: ["<on-clause>"] to suppress a specific manifest join. After the overlay exists, use sl_edit_source for further tweaks. See sl_capture skill for the canonical overlay rule.
Join discovery: When your card's SQL references warehouse tables (e.g. in FROM or JOIN clauses), call sl_discover({ query: '<table>' }) before writing. The matching manifest entry's name is the value you use in joins: [- to: <name>] only when the card output exposes a local key that matches the target source grain (for example account_id = mart_account_segments.account_id). Do not declare a KTX join just because the card SQL joins that table i