Quick Start

Scaffold the project with marivo init (see Installation), then author your declarations. A minimal Marivo project has a manifest, datasource declarations, and semantic declarations:

your-project/
  marivo.toml
  models/
    datasources/
      warehouse.py
    semantic/
      sales/
        _domain.py
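
After scaffolding, a small check like the one below can confirm that the layout above is in place. This is a minimal sketch using only the Python standard library; the helper name missing_paths is hypothetical and not part of Marivo, and the expected paths are simply the ones from the tree above.

```python
from pathlib import Path

# Relative paths taken from the minimal layout shown above.
EXPECTED = (
    "marivo.toml",
    "models/datasources/warehouse.py",
    "models/semantic/sales/_domain.py",
)


def missing_paths(project_root, expected=EXPECTED):
    """Return the expected relative paths that are not files under project_root.

    Hypothetical helper for illustration; Marivo itself does not provide it.
    """
    root = Path(project_root)
    return [rel for rel in expected if not (root / rel).is_file()]


# Check the current directory; an empty list means the layout is complete.
print(missing_paths("."))
```

An empty list means all three files exist; anything else names what still needs to be authored.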

Declare a datasource:

import marivo.datasource as md

md.duckdb(
    name="warehouse",
    path="warehouse.duckdb",
    ai_context={
        "business_definition": "Local DuckDB warehouse for sales analysis.",
        "guardrails": ["Use only for development or approved local analysis."],
    },
)

Declare semantic objects:

import marivo.datasource as md
import marivo.semantic as ms

ms.domain(
    name="sales",
    ai_context={
        "business_definition": "Sales order analysis domain.",
        "guardrails": ["Revenue metrics should only use completed orders."],
    },
)

orders = ms.entity(
    name="orders",
    datasource=md.ref("warehouse"),
    source=ms.table("orders"),
    primary_key=["order_id"],
    ai_context={
        "business_definition": "One row per sales order.",
        "guardrails": ["Exclude cancelled test data in metric definitions."],
    },
)


@ms.dimension(
    entity=orders,
    name="region",
    ai_context={
        "business_definition": "Sales region assigned to the order.",
        "guardrails": ["Do not treat missing region as a real region."],
    },
)
def region(orders):
    return orders.region


@ms.time_dimension(
    entity=orders,
    name="order_date",
    granularity="day",
    is_default=True,
    ai_context={
        "business_definition": "Calendar date when the order was placed.",
        "guardrails": ["Use this as the default time axis for order metrics."],
    },
)
def order_date(orders):
    return orders.order_date


@ms.metric(
    entities=[orders],
    additivity="additive",
    name="revenue",
    ai_context={
        "business_definition": "Total completed order amount.",
        "guardrails": ["Use only where order amount is already net of cancellations."],
    },
)
def revenue(orders):
    return orders.amount.sum()

You don’t have to write these by hand. marivo init installed the marivo-semantic skill into .claude/skills/ and .codex/skills/, so a coding agent (Claude Code or Codex) running in the project can build the semantic layer with you. Point it at your tables and ask:

Use the marivo-semantic skill to model the orders table in the sales domain.

The skill walks the agent through a disciplined authoring loop, one object at a time:

Follow the ladder — domain → entity → dimension → time dimension → measure → metric → relationship — so each object’s dependencies exist before it.
Plan, then write. For each object, the agent calls an ms.prepare_* API (e.g. ms.prepare_metric) and branches on the returned brief before writing the declaration into models/semantic/<domain>/_domain.py.
Verify before advancing. After each object it runs ms.verify_object(ref) and does not move on while that fails.
Gate at closeout. It finishes with ms.readiness() and resolves blockers before handing the catalog to analysis.
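
The control flow of that loop can be sketched in plain Python. This is an illustration of the ordering and the stop-on-failure rule only: prepare, write, and verify are hypothetical callables standing in for the ms.prepare_* call, the file edit, and ms.verify_object, and the object refs are made up for the example.

```python
# The ladder from the list above, in dependency order.
LADDER = [
    "domain",
    "entity",
    "dimension",
    "time dimension",
    "measure",
    "metric",
    "relationship",
]


def authoring_loop(objects, prepare, write, verify):
    """Author (kind, ref) pairs in ladder order, one at a time.

    prepare, write, and verify are stand-ins (assumptions) for the skill's
    plan, write, and verify steps. Stops at the first object whose
    verification fails and returns (verified_refs, failed_ref_or_None).
    """
    ordered = sorted(objects, key=lambda item: LADDER.index(item[0]))
    verified = []
    for kind, ref in ordered:
        brief = prepare(kind, ref)   # plan
        write(kind, ref, brief)      # then write
        if not verify(ref):          # verify before advancing
            return verified, ref
        verified.append(ref)
    return verified, None


# Usage with trivial stubs; refs are illustrative.
written = []
done, failed = authoring_loop(
    [("metric", "sales.revenue"), ("domain", "sales"), ("entity", "sales.orders")],
    prepare=lambda kind, ref: {"kind": kind, "ref": ref},
    write=lambda kind, ref, brief: written.append(ref),
    verify=lambda ref: True,
)
print(done)  # ['sales', 'sales.orders', 'sales.revenue']
```

Sorting by ladder position guarantees that a metric is never attempted before the entity it depends on, and returning at the first failed verification mirrors the "does not move on while that fails" rule.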

The agent asks you only about decisions it cannot infer from the data or project docs (for example, whether an amount is already net of refunds). Everything it produces is the same kind of Python declaration shown above, reviewable in git like any other code.

Discover the catalog before analysis:

import marivo.semantic as ms

catalog = ms.load()
catalog.list().show()

report = catalog.readiness()
if report.status == "blocked":
    report.show()

Run an analysis session after the catalog is ready:

import marivo.analysis as mv

session = mv.session.get_or_create(name="revenue-check", question="Why did Q4 drop?")
catalog = session.catalog
revenue = catalog.get("sales.revenue")
region = catalog.get("sales.orders.region")

# Current period: Q4 2026, by month and region.
current = session.observe(
    revenue,
    timescope={"start": "2026-10-01", "end": "2027-01-01"},
    grain="month",
    dimensions=[region],
)
# Baseline: the same quarter one year earlier.
baseline = session.observe(
    revenue,
    timescope={"start": "2025-10-01", "end": "2026-01-01"},
    grain="month",
    dimensions=[region],
)
# Compare the two observations, then attribute the change to regions.
delta = session.compare(current, baseline)
attribution = session.decompose(delta, axis=region)
attribution.show()
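
To build intuition for what comparing two periods and decomposing the change along a dimension means, here is a plain-Python sketch of the arithmetic. It is not Marivo's implementation and may differ from what session.decompose actually computes; the function name and all numbers are made up for illustration.

```python
def decompose_delta(current, baseline):
    """Split the total change between two periods into per-segment parts.

    current and baseline map a segment (e.g. a region) to a metric value.
    Returns (total_delta, per_segment_delta, per_segment_share).
    """
    segments = sorted(set(current) | set(baseline))
    deltas = {s: current.get(s, 0.0) - baseline.get(s, 0.0) for s in segments}
    total = sum(deltas.values())
    # Share of the total change explained by each segment (0 if nothing changed).
    shares = {s: (d / total if total else 0.0) for s, d in deltas.items()}
    return total, deltas, shares


# Illustrative values only, not real results.
current = {"EMEA": 80.0, "APAC": 50.0, "AMER": 120.0}
baseline = {"EMEA": 100.0, "APAC": 50.0, "AMER": 130.0}

total, deltas, shares = decompose_delta(current, baseline)
print(total)   # -30.0
print(deltas)  # {'AMER': -10.0, 'APAC': 0.0, 'EMEA': -20.0}
```

In this toy case EMEA accounts for two thirds of the drop, which is the kind of answer an attribution along region is meant to surface.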

Best practices

The same project structure scales from a one-off script to a reviewed, shared analysis project. Three habits make the difference.

Build the semantic layer as a shared knowledge base

The semantic layer is not just plumbing to reach tables — it is the knowledge base an agent reads before it analyzes. Invest in ai_context on every object:

business_definition — what the metric or dimension means, in business terms.
guardrails — rules an agent must respect: required filters, exclusions, scope limits.
synonyms and examples — so an agent resolves a natural-language question to the right object instead of guessing.

A well-enriched object answers an agent’s “can I use this, and how?” without a human in the loop. Readiness enforces the floor: a missing business_definition blocks analysis, and missing guardrails raise a warning. See the Semantic Layer for the full ai_context contract.
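
The floor described above can be sketched as a small check on an ai_context dict. This is an illustration of the rule, not Marivo's readiness code: the function name is hypothetical, and while "blocked" matches the status used in the Quick Start, the "warning" and "ok" labels are assumptions.

```python
def check_ai_context(ai_context):
    """Classify an ai_context dict against the readiness floor.

    Returns "blocked" when business_definition is missing or empty,
    "warning" when guardrails are missing or empty, and "ok" otherwise.
    """
    if not ai_context.get("business_definition"):
        return "blocked"
    if not ai_context.get("guardrails"):
        return "warning"
    return "ok"


complete = {
    "business_definition": "Total completed order amount.",
    "guardrails": ["Use only where order amount is already net of cancellations."],
}
print(check_ai_context(complete))                        # ok
print(check_ai_context({"business_definition": "x"}))    # warning
print(check_ai_context({"guardrails": ["some rule"]}))   # blocked
```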

Manage the project with git

A Marivo project is plain text: marivo.toml plus the Python files under models/. That makes the semantic layer a reviewable, shareable artifact — treat it like application code.

Version the contract. Commit marivo.toml and models/. Every change to a metric definition or guardrail shows up as a diff.
Review semantic changes like code. Land definition changes through pull requests so a domain owner approves what a metric means before agents use it.
Share through the repo. Anyone who clones the project — and any agent that runs in it — gets the same trusted catalog.
Keep state and secrets out of git. Add .marivo/ to .gitignore: it holds project-local session and evidence state, not the contract. Credentials are authored as *_env references and resolved from the environment (or cached in user-global ~/.marivo/secrets.toml) — they are never written into the project.

A typical .gitignore:

.marivo/
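
The idea behind *_env references, where the project stores only the name of an environment variable and never the secret itself, can be sketched as follows. This is a generic illustration, not Marivo's resolver: the function name, the config keys, and the variable name WAREHOUSE_PASSWORD are all hypothetical, and the ~/.marivo/secrets.toml fallback is deliberately left out.

```python
import os


def resolve_env_refs(config, environ=None):
    """Replace each '<name>_env' key with '<name>' read from the environment.

    Hypothetical helper: keys ending in '_env' hold an environment variable
    name; all other keys are copied through unchanged.
    """
    environ = os.environ if environ is None else environ
    resolved = {}
    for key, value in config.items():
        if key.endswith("_env"):
            if value not in environ:
                raise KeyError(f"environment variable {value!r} is not set")
            resolved[key[: -len("_env")]] = environ[value]
        else:
            resolved[key] = value
    return resolved


# The committed config names the variable; the secret lives outside the repo.
committed = {"user": "analyst", "password_env": "WAREHOUSE_PASSWORD"}
fake_environ = {"WAREHOUSE_PASSWORD": "not-a-real-secret"}
print(resolve_env_refs(committed, environ=fake_environ))
# {'user': 'analyst', 'password': 'not-a-real-secret'}
```

Because only the variable name is committed, a diff or a clone of the project never exposes the credential.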

Gate before handoff

Run ms.readiness() after loading and resolve blockers before any analysis session. A project can load while readiness is still blocked, so never pass a blocked catalog to an agent. See Readiness.
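
One way to make the gate hard to skip is a small guard that refuses to continue on a blocked report. The sketch below relies only on the report.status == "blocked" check shown in the Quick Start; the guard function, the exception class, and the SimpleNamespace stand-ins for a real readiness report are hypothetical.

```python
from types import SimpleNamespace


class CatalogBlockedError(RuntimeError):
    """Raised when a readiness report is blocked (hypothetical helper class)."""


def require_ready(report):
    """Return the report unchanged, or raise if its status is "blocked"."""
    if report.status == "blocked":
        raise CatalogBlockedError(
            "resolve readiness blockers before starting an analysis session"
        )
    return report


# Stand-ins for real readiness reports; "ready" is an assumed non-blocked status.
ready_report = SimpleNamespace(status="ready")
blocked_report = SimpleNamespace(status="blocked")

require_ready(ready_report)  # passes through
try:
    require_ready(blocked_report)
except CatalogBlockedError as exc:
    print(f"handoff refused: {exc}")
```

Calling the guard at the top of an analysis script turns "never pass a blocked catalog to an agent" from a convention into a failure that cannot be missed.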