The Satsuma CLI is built for AI agents. 21 parser-backed commands let your agent slice workspaces, trace lineage, extract metadata, and pull NL intent — all from the parse tree, with 100% deterministic results.
Humans use it too — validate, fmt, and lint are your day-to-day workflow commands. But the CLI's real power is as the structural backbone your agent reasons on top of.
The CLI extracts facts. The agent interprets them. The CLI also powers the VS Code extension's diagnostics, lineage views, and code intelligence.
```
# Teach your agent about Satsuma in one command
$ satsuma agent-reference >> AGENTS.md

# Agent can now query your workspace
$ satsuma summary examples/sfdc-to-snowflake/pipeline.stm
Workspace Summary  examples/sfdc-to-snowflake/pipeline.stm
  Schemas: 8  Mappings: 5  Metrics: 2  Fragments: 1
  Files scanned: 6

$ satsuma lineage --from loyalty_sfdc
loyalty_sfdc
  -> sat_customer_demographics (loyalty to demographics)
  -> mart_customer_360 (demographics to mart)
```
satsuma agent-reference prints a compact prompt that teaches any AI agent the Satsuma grammar, the CLI commands, and the recommended workflow patterns. Append it to your agent's instructions file and you're done.
Append the reference to whatever file your agent reads at startup. Works with Claude Code, Copilot, Cursor, Windsurf, or any agent that reads a system prompt from a file.
```
# Claude Code
$ satsuma agent-reference >> CLAUDE.md

# GitHub Copilot
$ satsuma agent-reference >> .github/copilot-instructions.md

# Cursor
$ satsuma agent-reference >> .cursor/rules/satsuma.mdc

# Or paste into any conversation
$ satsuma agent-reference | pbcopy
```
The reference covers:
.stm files
The CLI gives agents token-efficient structural queries instead of dumping entire files into context. Agents compose these primitives into higher-level workflows — the CLI extracts the facts, the agent reasons over them.
Your agent can trace any data element from source to destination across the entire workspace. It starts with lineage for schema-level paths, then drills into arrows for field-level detail.
When an arrow is classified as nl, the agent reads the natural-language intent and interprets it. When it's structural, the transform pipeline is fully specified — no interpretation needed.
- `lineage --from` traces all downstream consumers of a schema
- `arrows --as-source` follows each field through its transforms
- `nl` extracts NL intent at `[nl]` hops for the agent to interpret
- `where-used` finds every reference to a schema or fragment
```
# Schema-level: where does this data go?
$ satsuma lineage --from loyalty_sfdc
loyalty_sfdc
  -> sat_customer_demographics
  -> mart_customer_360

# Field-level: trace LoyaltyTier through transforms
$ satsuma arrows loyalty_sfdc.LoyaltyTier --as-source --json
{
  "target": "sat_customer_demographics.loyalty_tier",
  "classification": "structural",
  "transform": "UPPER | TRIM"
}

# Follow the chain — next hop is NL
$ satsuma arrows sat_customer_demographics.loyalty_tier --as-source
[nl] loyalty_tier -> mart_customer_360.tier_label
  "Map Gold/Silver/Bronze to internal codes"
# Agent reads the NL and decides what to do
```
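That hop-by-hop loop is easy for an agent to automate. A minimal Python sketch, using arrow records shaped like the `--json` output above; the in-memory `ARROWS` dict is a stand-in for actual CLI calls, not part of the CLI itself:

```python
# Stand-in for per-field `satsuma arrows <field> --as-source --json` calls.
# Record shapes are copied from the example output above.
ARROWS = {
    "loyalty_sfdc.LoyaltyTier": {
        "target": "sat_customer_demographics.loyalty_tier",
        "classification": "structural",
        "transform": "UPPER | TRIM",
    },
    "sat_customer_demographics.loyalty_tier": {
        "target": "mart_customer_360.tier_label",
        "classification": "nl",
        "transform": "Map Gold/Silver/Bronze to internal codes",
    },
}

def trace(field: str) -> list[dict]:
    """Follow --as-source arrows hop by hop until the chain ends."""
    hops = []
    while field in ARROWS:
        arrow = ARROWS[field]
        hops.append(arrow)
        field = arrow["target"]
    return hops

chain = trace("loyalty_sfdc.LoyaltyTier")
# Structural hops are fully specified; NL hops are what the agent must read.
needs_review = [a for a in chain if a["classification"] == "nl"]
```

The agent trusts the `structural` hop outright and spends its reasoning budget only on the entries in `needs_review`.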
Satsuma uses natural-language strings for intent that can't be expressed as deterministic pipelines. The nl command extracts these verbatim — notes, transform descriptions, and comments — so your agent can analyze, critique, or summarize them.
Agents use this to review business logic, check if NL descriptions match the structural transforms around them, identify ambiguities, or generate documentation from the intent strings.
@ref references in NL are machine-extractable — the CLI traces them
```
# All NL content in a mapping
$ satsuma nl 'demographics to mart'
mart_customer_360.full_name [transform]
  "Concatenate first and last name from `@ref sat_customer_demographics`"
mart_customer_360.tier_label [transform]
  "Map Gold/Silver/Bronze to internal codes per the tier mapping in `@ref lookup_tiers`"
mart_customer_360 [note]
  "This mart combines demographic and loyalty data into a single customer view for BI"

# Field-level NL only
$ satsuma nl mart_customer_360.email
  "Hash with SHA-256 before loading into the mart. Original plaintext stays in the sat."
```
Satsuma's metadata system is open-ended — any token can be a tag. Your agent uses meta and find --tag to extract these, then combines them with your organisation's guidelines to drive code generation.
For example, if your team follows the Data Vault standard, your agent reads pk, bk, hash_diff tags from schema metadata and generates hub, satellite, and link DDL accordingly. If your team uses pii and encrypt tags, the agent knows to emit encryption logic.
The CLI doesn't know what hash_diff means — it just extracts the tag. Your agent, armed with your org's standards doc, interprets it.
```
# Read metadata on a target schema
$ satsuma meta hub_customer
hub_customer
  customer_hk (pk, hash_key)
  customer_bk (bk, required)
  load_date (required)
  record_source (required)

# Find all PII fields across the workspace
$ satsuma find --tag pii --json
[
  { "schema": "loyalty_sfdc", "field": "Email", "tags": ["pii", "encrypt"] },
  { "schema": "loyalty_sfdc", "field": "SSN", "tags": ["pii", "encrypt", "mask"] }
]

# Agent reads your org's Data Vault standard,
# sees pk + hash_key, and generates:
#   CREATE TABLE hub_customer (
#     customer_hk BINARY(32) NOT NULL,
#     customer_bk VARCHAR(255) NOT NULL,
#     ...
```
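For instance, an agent holding your org's Data Vault standard might render the `meta` output above into DDL. A sketch: the field/tag pairs are copied from the example, while the type and nullability rules (BINARY(32) for `hash_key`, NOT NULL for `pk`/`required`) are assumed org conventions that the CLI knows nothing about:

```python
# Field/tag pairs as extracted from `satsuma meta hub_customer` above.
FIELDS = [
    ("customer_hk", ["pk", "hash_key"]),
    ("customer_bk", ["bk", "required"]),
    ("load_date", ["required"]),
    ("record_source", ["required"]),
]

def hub_ddl(table: str, fields: list) -> str:
    """Render tagged fields into CREATE TABLE DDL per assumed org rules."""
    cols = []
    for name, tags in fields:
        # Org convention (assumed): hash keys are BINARY(32), rest VARCHAR.
        sql_type = "BINARY(32)" if "hash_key" in tags else "VARCHAR(255)"
        null = " NOT NULL" if {"pk", "required"} & set(tags) else ""
        cols.append(f"  {name} {sql_type}{null}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(cols) + "\n);"

ddl = hub_ddl("hub_customer", FIELDS)
```

Swap in your own standard (typed dates, hash-diff columns, audit fields) and the same pattern holds: tags in, generated code out.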
For complex analysis, graph --json exports the complete semantic graph in a single call — all nodes, edges, field-level data flow, and unresolved NL arrows. The agent loads it once and reasons offline, without round-trips.
- `--schema-only` and `--no-nl` reduce payload for large workspaces
- `--namespace` scopes the export to a single namespace
- the `unresolved_nl` section surfaces all NL arrows awaiting interpretation
```
$ satsuma graph examples/sfdc-to-snowflake/pipeline.stm --json
{
  "nodes": [
    { "name": "loyalty_sfdc", "type": "schema" },
    { "name": "sat_customer_demographics", ... },
    { "name": "mart_customer_360", ... }
  ],
  "schema_edges": [
    { "from": "loyalty_sfdc", "to": "sat_customer_demographics", "role": "source" },
    ...
  ],
  "edges": [ ... ],
  "unresolved_nl": [ ... ]
}

# Narrow scope for large workspaces
$ satsuma graph examples/sfdc-to-snowflake/pipeline.stm --json --namespace warehouse
```
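Once the export is loaded, reach questions become plain graph traversal. A sketch assuming only the `schema_edges` shape shown above; the two edge records are copied from the example, and everything else is ordinary BFS:

```python
from collections import defaultdict, deque

# A `graph --json` export, trimmed to schema_edges (shape as shown above).
graph = {
    "schema_edges": [
        {"from": "loyalty_sfdc", "to": "sat_customer_demographics", "role": "source"},
        {"from": "sat_customer_demographics", "to": "mart_customer_360", "role": "source"},
    ],
}

# Build an adjacency list once; answer many questions offline.
adj = defaultdict(list)
for e in graph["schema_edges"]:
    adj[e["from"]].append(e["to"])

def downstream(schema: str) -> set:
    """Every schema reachable from `schema`, via BFS over schema_edges."""
    seen, queue = set(), deque([schema])
    while queue:
        for nxt in adj[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

One CLI call, then arbitrarily many impact queries with no further round-trips.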
"What breaks if I change this field?"
```
$ satsuma arrows loyalty_sfdc.LoyaltyTier --as-source --json
$ satsuma arrows sat_customer_demographics.loyalty_tier --as-source --json
$ satsuma nl mart_customer_360.loyalty_tier
```
"Does PII survive through the pipeline unencrypted?"
```
$ satsuma find --tag pii --json
$ satsuma arrows loyalty_sfdc.Email --as-source --json
$ satsuma nl mart_customer_360.email
```
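A sketch of how an agent might stitch those outputs together. The PII records copy the `find --tag pii --json` example; the `downstream_transforms` dict stands in for per-field `arrows`/`nl` calls, and treating a mention of SHA as "encrypted" is an illustrative heuristic, not CLI behaviour:

```python
# From `satsuma find --tag pii --json` (shapes copied from the example).
pii_fields = [
    {"schema": "loyalty_sfdc", "field": "Email", "tags": ["pii", "encrypt"]},
    {"schema": "loyalty_sfdc", "field": "SSN", "tags": ["pii", "encrypt", "mask"]},
]

# Stand-in for per-field `arrows --as-source` / `nl` lookups.
downstream_transforms = {
    "loyalty_sfdc.Email": "Hash with SHA-256 before loading into the mart.",
    # No transform recorded for SSN -- exactly the gap we want to catch.
}

findings = []
for f in pii_fields:
    path = f'{f["schema"]}.{f["field"]}'
    transform = downstream_transforms.get(path, "")
    # Illustrative rule: encrypt-tagged PII must show hashing somewhere downstream.
    if "encrypt" in f["tags"] and "sha" not in transform.lower():
        findings.append(path)
```

Each entry in `findings` is a PII field tagged `encrypt` with no hashing visible in its downstream transforms, ready for the agent to report.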
"Which target fields have no mapping?"
```
$ satsuma fields mart_customer_360 --unmapped-by 'demographics to mart'
$ satsuma fields mart_customer_360 --unmapped-by 'online to mart'
```
"Match source to target and write the mapping"
```
$ satsuma match-fields --source loyalty_sfdc --target sat_customer
$ satsuma nl sat_customer
$ satsuma meta sat_customer.country_code
```
These three commands are the human side of the CLI. Run them before committing, in CI, or as editor commands. They're also used by the VS Code extension under the hood.
"Is my workspace well-formed?"
Checks parse errors, undefined schema references, missing fields, and invalid paths. Run it before every commit.
```
$ satsuma validate examples/sfdc-to-snowflake/pipeline.stm
valid — 0 errors, 0 warnings

$ satsuma validate --json
{ "valid": true, "errors": 0, "warnings": 0 }
```
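The JSON form slots straight into a CI gate. A minimal sketch assuming the report shape shown above; how you capture the CLI's stdout (a subprocess call, a pipe) depends on your pipeline and is not shown:

```python
import json

def ci_gate(report_json: str) -> int:
    """Turn a `validate --json` report into a process exit code.

    Nonzero on any error or warning, so the build fails fast.
    """
    report = json.loads(report_json)
    return 0 if report["valid"] and report["warnings"] == 0 else 1

# Report string copied from the example output above.
status = ci_gate('{ "valid": true, "errors": 0, "warnings": 0 }')
```

Failing on warnings as well as errors is a policy choice; relax the condition if your team treats warnings as advisory.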
"One canonical style, zero config."
Opinionated formatter backed by the tree-sitter CST. Semantics-preserving. Use --check in CI.
```
$ satsuma fmt examples/sfdc-to-snowflake/pipeline.stm
Formatted 1 file(s)

$ satsuma fmt --check examples/sfdc-to-snowflake/pipeline.stm
0 file(s) would be reformatted

$ satsuma fmt --diff mapping.stm
```
"Does this follow best practices?"
Policy and convention checks with --fix for safe autofix. Catches hidden NL dependencies and duplicate definitions.
```
$ satsuma lint --fix
Fixed 2 issue(s)

$ satsuma lint --rules
hidden-source-in-nl (fixable)
unresolved-nl-ref
duplicate-definition
```
Every arrow the CLI returns carries a classification derived from CST node types. This tells the agent whether it needs to interpret the transform or can trust the syntax.
| Classification | Meaning | Agent action |
|---|---|---|
| `structural` | Deterministic pipeline | None — fully specified |
| `nl` | Natural-language string | Read and interpret intent |
| `mixed` | Pipeline steps + NL strings | Review the NL portion |
| `none` | Bare `src -> tgt` | None |
| `nl-derived` | Implicit from `@ref` | Verify referenced field exists |
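The table above reduces to a small dispatch on the agent side. A sketch; the handler strings are illustrative agent actions, not CLI output:

```python
# One action per classification value, mirroring the table above.
ACTIONS = {
    "structural": "trust: pipeline is fully specified",
    "nl": "interpret the natural-language intent",
    "mixed": "review the NL portion of the pipeline",
    "none": "trust: bare src -> tgt",
    "nl-derived": "verify the @ref'd field exists",
}

def agent_action(arrow: dict) -> str:
    """Pick the follow-up action for one arrow record from `arrows --json`."""
    return ACTIONS[arrow["classification"]]

action = agent_action({"classification": "nl"})
```

Because the classification comes from CST node types rather than from a model, this dispatch is deterministic: the agent never has to guess whether a transform needs interpretation.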
Download a prebuilt package from the v0.7.0 release and install globally with npm.
```
# Universal — works on macOS, Linux, and Windows (WASM-based)
$ npm install -g https://github.com/thorbenlouw/satsuma-lang/releases/download/v0.7.0/satsuma-cli-v0.7.0.tgz

$ satsuma --help
satsuma <command> [options]
21 commands available
```
```
# Or track the latest release instead of pinning a version
$ npm install -g https://github.com/thorbenlouw/satsuma-lang/releases/download/latest/satsuma-cli-latest.tgz
```
All commands accept --json for structured output and --help for usage details. Many support --compact for minimal output.
Complete reference for every command in the CLI. See the full reference on GitHub for detailed usage and examples.
Block-level extraction — retrieve whole blocks or workspace-level summaries.
Workspace overview — schemas, mappings, metrics, and counts.
Full schema definition from the parse tree.
Full definition of a schema decorated with metric metadata — grain, slice, filter, and measure fields.
Full mapping with all arrows and transforms.
Find all fields carrying a metadata tag (pii, encrypt, etc.).
Schema-level graph traversal, forward or backward.
All references to a schema, fragment, or transform.
All //! and //? comments across the workspace.
Keyword-ranked block extraction (heuristic fuzzy search).
Fine-grained extraction — slice below block level for arrows, NL, metadata, and fields.
All arrows for a field, with transform classification.
NL content — notes, transforms, comments — extracted verbatim.
Metadata entries — tags, constraints, annotations.
Field list with types. Supports --unmapped-by.
Normalized name comparison between source and target schemas.
Full semantic graph export in a single call.
Complete graph with nodes, edges, and field-level data flow.
Schema-level adjacency list (minimal payload).
Topology only, omit field-level edges.
Formatting, validation, linting, and structural comparison.
Opinionated, zero-config formatter. --check for CI.
Parse errors and semantic reference checks.
Policy checks with --fix autofix.
Structural comparison of two workspace snapshots.
Bootstrap your AI agent.
Print the AI Agent Reference for embedding in agent instructions.
Transform strings, notes, and comments are extracted verbatim. The CLI never assesses whether an NL transform is correct or complete.
There are no impact, coverage, or audit commands. These are agent workflows built from primitives.
The CLI is deterministic, fast, and reproducible. Same input, same output, every time.
Commands take explicit structural arguments. The agent decides which commands to call based on the user's question.
Install the CLI, run satsuma agent-reference >> AGENTS.md, and your agent can query your data mappings in seconds.