Analysis Workflow
Every Marivo analysis starts from a metric and runs as a write-run-read loop: an agent writes an intent (an operator call plus its fields), runs it, and reads a typed result before deciding the next step.
The pieces:
- A session holds the guiding question, the semantic catalog, and the persisted results of every step.
- An intent is one operator call (`observe`, `compare`, `decompose`, …) with its parameters. The parameters are the analysis specification.
- A frame is the typed result of an intent. Frames are the boundary between steps: each operator consumes specific frame types and produces another.
```python
import marivo.analysis as mv
import marivo.semantic as ms
```

Open a session

```python
session = mv.session.get_or_create(name="revenue-investigation", question="Why did Q4 drop?")
catalog = session.catalog
revenue = catalog.get("sales.revenue")        # a metric object
region = catalog.get("sales.orders.region")   # a dimension object
```

`get_or_create` is idempotent: it attaches to an existing session of that name or creates one, and sets it current. The narrow session API is `get_or_create`, `current()`, `list()`, and `delete(name)`.
How an intent is specified
Operators take catalog objects and a few shared value objects. You will reuse these across almost every intent:
| Value | Shape | Meaning |
|---|---|---|
| metric input | `catalog.get("sales.revenue")` | A catalog metric object or its `SemanticRef` subclass (e.g. `MetricRef`). Authoring refs from `ms.aggregate(...)` also work. Bare strings are rejected. |
| dimension input | `catalog.get("sales.orders.region")` | A catalog dimension object or its `DimensionRef` / `TimeDimensionRef` (see Semantic refs). Used for `dimensions`, `where` keys, and `axis`. |
| `timescope` | `{"start": "2026-10-01", "end": "2027-01-01"}` | Half-open time range: start inclusive, end exclusive. |
| `grain` | `"day"` \| `"week"` \| `"month"` \| `"quarter"` \| `"year"` \| `"hour"` \| … | Time bucket size. Present ⇒ time series or panel. |
| `dimensions` | `[region, country]` | Segment axes. In v1 all must resolve to the metric’s entity. |
| `where` | `{region: "US"}` or `{amount: {"op": ">", "value": 100}}` | Pre-aggregation row filter (see the predicate operators below). |
| `AlignmentPolicy` | `mv.window_bucket()` | How two windows are paired for `compare` / `correlate`. |
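
Half-open ranges are easy to get wrong at the boundaries. As a minimal sketch in plain Python (the helper `in_timescope` is hypothetical and not part of Marivo), the Q4 window from the table contains 1 October and 31 December but not 1 January:

```python
from datetime import date


def in_timescope(day: date, timescope: dict) -> bool:
    """Return True when `day` falls in the half-open range [start, end)."""
    start = date.fromisoformat(timescope["start"])
    end = date.fromisoformat(timescope["end"])
    return start <= day < end


q4 = {"start": "2026-10-01", "end": "2027-01-01"}
first_day_in = in_timescope(date(2026, 10, 1), q4)   # start is inclusive
last_day_in = in_timescope(date(2026, 12, 31), q4)   # last day of the quarter
end_day_in = in_timescope(date(2027, 1, 1), q4)      # end is exclusive
```

Because the end is exclusive, adjacent windows such as Q3 and Q4 can share a boundary date without counting any day twice.
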
The result’s semantic kind follows from `grain` and `dimensions`: `scalar` (neither), `time_series` (`grain` only), `segmented` (`dimensions` only), or `panel` (both).
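
The rule is small enough to state as a function. The following is a minimal sketch in plain Python, not Marivo's implementation; `semantic_kind` is a hypothetical helper name, and treating an empty list like `None` is an assumption.

```python
from typing import Optional, Sequence


def semantic_kind(grain: Optional[str], dimensions: Optional[Sequence[object]]) -> str:
    """Predict the result kind from the two shaping fields (hypothetical helper)."""
    has_grain = grain is not None
    has_dims = bool(dimensions)  # assumption: None and [] both mean "no segment axes"
    if has_grain and has_dims:
        return "panel"
    if has_grain:
        return "time_series"
    if has_dims:
        return "segmented"
    return "scalar"
```
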
where predicate operators
Keys are catalog dimensions; values are a scalar (`==`), a list (`in`), or a structured `{"op": ..., "value": ...}` form:
| Form | Meaning |
|---|---|
| `"US"` | `==` (equality) |
| `["US", "CA"]` | `in` (membership) |
| `{"op": "!=", "value": "US"}` | not equal |
| `{"op": ">", "value": 100}` (`>=`, `<`, `<=` likewise) | numeric comparison |
| `{"op": "between", "value": ["2026-07-01", "2026-09-30"]}` | inclusive range (exactly two values) |
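
To make the three forms concrete, here is a minimal sketch in plain Python that evaluates one row value against one predicate. It only illustrates the semantics in the table; the real filter runs before aggregation in the backend, and `matches` is a hypothetical helper, not a Marivo function.

```python
import operator

# Comparison operators listed in the table above.
_COMPARATORS = {
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(value, predicate) -> bool:
    """Evaluate one row value against one `where` form (illustrative only)."""
    if isinstance(predicate, dict):
        op, operand = predicate["op"], predicate["value"]
        if op == "between":
            low, high = operand  # exactly two values, both ends inclusive
            return low <= value <= high
        return _COMPARATORS[op](value, operand)
    if isinstance(predicate, (list, tuple)):
        return value in predicate  # list form means membership
    return value == predicate  # bare scalar means equality
```

Note that `between` is inclusive on both ends, unlike `timescope`, which excludes its end.
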
Core operators
observe → MetricFrame
The starting point for any analysis: materialize a metric over a time range and/or segments.
| Parameter | Type | Required | Default | Meaning |
|---|---|---|---|---|
| `metric` | metric object / ref | Yes | — | The metric to materialize. |
| `timescope` | dict | No | `None` | Half-open `{"start", "end"}` window. |
| `grain` | grain | No | `None` | Time bucket; present ⇒ time series or panel. |
| `dimensions` | list[ref] | No | `None` | Segment axes. |
| `where` | dict | No | `None` | Pre-aggregation row filter. |
| `time_dimension` | ref | No | entity default | Pick the time axis when the entity declares several. |
| `expect_shape` | shape | No | `None` | Guard; raises before backend work if the predicted shape differs. |
```python
current = session.observe(
    revenue,
    timescope={"start": "2026-10-01", "end": "2027-01-01"},
    grain="month",
    dimensions=[region],
)
```

compare → DeltaFrame

Quantify change between two `observe` results (current minus baseline). The frames must share metric and semantic kind.
| Parameter | Type | Required | Default | Meaning |
|---|---|---|---|---|
| `current` | `MetricFrame` | Yes | — | Current-period frame. |
| `baseline` | `MetricFrame` | Yes | — | Baseline-period frame. |
| `alignment` | `AlignmentPolicy` | No | `window_bucket` | How buckets/segments are paired. |
```python
baseline = session.observe(
    revenue,
    timescope={"start": "2025-10-01", "end": "2026-01-01"},
    grain="month",
    dimensions=[region],
)
delta = session.compare(current, baseline)
```

`compare` pairs buckets with `window_bucket` by default. Pass `alignment=` to override it with `mv.dow_aligned()`, `mv.holiday_aligned()`, or `mv.holiday_and_dow_aligned()`; the calendar-backed kinds also take `calendar=mv.CalendarRef(...)`.
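
To make "current minus baseline" concrete, here is a minimal sketch in plain Python. It assumes the default pairing matches buckets by their position inside each window (first month with first month, and so on); that reading of `window_bucket` is an assumption, `delta_by_position` is a hypothetical helper rather than a Marivo function, and the values are made up.

```python
def delta_by_position(current: list[float], baseline: list[float]) -> list[float]:
    """Pair the i-th bucket of each window and return current minus baseline."""
    if len(current) != len(baseline):
        raise ValueError("windows must contain the same number of buckets")
    return [c - b for c, b in zip(current, baseline)]


# Illustrative monthly values for Oct, Nov, Dec (made-up numbers).
q4_2026 = [120.0, 95.0, 110.0]
q4_2025 = [100.0, 105.0, 130.0]
deltas = delta_by_position(q4_2026, q4_2025)
```

A positive entry means the current bucket is above its paired baseline bucket, and a negative entry means it is below.
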
decompose → AttributionFrame
Attribute a delta’s movement across one segment axis: why did it change?
| Parameter | Type | Required | Default | Meaning |
|---|---|---|---|---|
| `frame` | `DeltaFrame` | Yes | — | The delta to explain. |
| `axis` | dimension | Yes | — | The segment axis to attribute over. |
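
One simple way to read an attribution over a segment axis is additive: each segment's contribution is its own current-minus-baseline change, and the contributions sum to the total delta. The sketch below illustrates that idea in plain Python. It is an assumption for illustration, not a description of how Marivo computes an `AttributionFrame`; `additive_attribution` is a hypothetical helper and the numbers are made up.

```python
def additive_attribution(current: dict[str, float], baseline: dict[str, float]) -> dict[str, float]:
    """Per-segment change; a segment missing from one side counts as 0 there."""
    segments = sorted(set(current) | set(baseline))
    return {s: current.get(s, 0.0) - baseline.get(s, 0.0) for s in segments}


current_by_region = {"US": 80.0, "CA": 30.0, "MX": 10.0}
baseline_by_region = {"US": 100.0, "CA": 25.0}
contributions = additive_attribution(current_by_region, baseline_by_region)
total_delta = sum(contributions.values())
```

This decomposition only adds up for additive measures such as sums; ratio metrics need a different scheme.
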
```python
attribution = session.decompose(delta, axis=region)
attribution.show()
```

correlate → AssociationResult

Measure the association between two metrics over aligned buckets.
| Parameter | Type | Required | Default | Meaning |
|---|---|---|---|---|
| `a`, `b` | `MetricFrame` | Yes | — | The two frames to associate. |
| `measure_a`, `measure_b` | str | No | frame measure | Numeric column on each frame. |
| `alignment` | `AlignmentPolicy` | No | `window_bucket` | Bucket pairing. |
| `method` | `"pearson"` | No | `"pearson"` | Correlation method (v1: Pearson, zero lag). |
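
For reference, the textbook Pearson coefficient at zero lag can be sketched in plain Python as follows. This shows the standard formula over two already aligned series; `pearson` is a hypothetical helper, and how Marivo handles missing buckets or constant series is not covered here.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equally long, already aligned series (zero lag)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two aligned series with at least two buckets")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```
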
forecast → ForecastFrame
Project a time series or panel forward.
| Parameter | Type | Required | Default | Meaning |
|---|---|---|---|---|
| `history` | `MetricFrame` (`time_series`/`panel`) | Yes | — | Continuous history, no NaNs. |
| `horizon` | int | Yes | — | Buckets to project (≥ 1). |
| `model` | `"naive"` \| `"seasonal_naive"` \| `"drift"` | No | `"seasonal_naive"` | Forecast strategy. |
| `seasonality_period` | int | No | by grain | Override the seasonal period (day=7, week=52, month=12, quarter=4). |
| `interval_level` | float | No | `0.95` | Confidence level for the prediction interval. |
| `measure_column` | str | No | frame measure | Column to forecast. |
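
The three model names correspond to well-known baseline forecasts. The sketch below gives their textbook point-forecast definitions in plain Python; whether Marivo's implementations match these exactly is an assumption, the function names are hypothetical, and prediction intervals are left out.

```python
# Default seasonal periods from the table above.
DEFAULT_PERIOD = {"day": 7, "week": 52, "month": 12, "quarter": 4}


def naive(history: list[float], horizon: int) -> list[float]:
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive(history: list[float], horizon: int, period: int) -> list[float]:
    """Repeat the last full season, cycling through it as needed."""
    if len(history) < period:
        raise ValueError("history must cover at least one full season")
    last_season = history[-period:]
    return [last_season[h % period] for h in range(horizon)]


def drift(history: list[float], horizon: int) -> list[float]:
    """Extend the straight line through the first and last observations."""
    if len(history) < 2:
        raise ValueError("drift needs at least two observations")
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * h for h in range(1, horizon + 1)]
```

Under these definitions, a seasonal naive forecast needs at least one full season of history, which is one practical reason to check the length of the observed window against the seasonal period.
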
```python
history = session.observe(revenue, timescope={"start": "2026-01-01", "end": "2026-04-01"}, grain="day")
projection = session.forecast(history, horizon=30)
```

assess_quality → QualityReport

Run quality checks (row counts, null ratios, time coverage, duplicate keys) over a `MetricFrame`. Returns per-check rows, blocking issues, and recommended follow-ups.
| Parameter | Type | Required | Default | Meaning |
|---|---|---|---|---|
| `frame` | `MetricFrame` | Yes | — | The frame to inspect. |
hypothesis_test → HypothesisTestResult
Paired test of whether a metric’s mean changed between two periods.
| Parameter | Type | Required | Default | Meaning |
|---|---|---|---|---|
| `a`, `b` | `MetricFrame` | Yes | — | Current and baseline frames. |
| `hypothesis` | `"mean_changed"` | No | `"mean_changed"` | Test type (v1). |
| `value_a`, `value_b` | str | No | frame measure | Numeric column on each frame. |
| `alignment` | `AlignmentPolicy` | No | `window_bucket` | Pairing for the test. |
| `sampling` | `SamplingPolicy` | No | inferred | Pairing/min-sample rules. |
| `alpha` | float | No | `0.05` | Significance level. |
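
The page does not name the statistic behind the paired test. A common choice for a paired comparison of means is the paired t-test, so the sketch below computes that statistic in plain Python. Treat it as an assumption about the general technique, not as Marivo's method; `paired_t_statistic` is a hypothetical helper, and turning the statistic into a p-value to compare against `alpha` is left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(a: list[float], b: list[float]) -> tuple[float, int]:
    """Return (t, degrees of freedom) for the mean of the paired differences a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [x - y for x, y in zip(a, b)]
    spread = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (spread / sqrt(len(diffs)))
    return t, len(diffs) - 1
```
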
Discovery — session.discover.* → CandidateSet
Discovery operators search a frame for noteworthy items and return a ranked `CandidateSet`. Pass `value="<column>"` to disambiguate when a frame has several numeric columns; `threshold` is the cutoff (lower ⇒ more candidates).
| Helper | Source shape | Required | Key options |
|---|---|---|---|
| `point_anomalies` | `MetricFrame` time_series/panel | — | `value`, `threshold=3.0` |
| `period_shifts` | `DeltaFrame` time_series/panel | ≥ 4 buckets | `value`, `threshold=2.0` |
| `driver_axes` | `DeltaFrame` | `search_space` | `value`, `limit` |
| `interesting_slices` | `MetricFrame` or `DeltaFrame` | — | `search_space`, `value`, `threshold=2.0`, `limit` |
| `interesting_windows` | time_series/panel frame | — | `value`, `threshold=2.0` |
| `cross_sectional_outliers` | `MetricFrame` segmented/panel | — | `peer_scope`, `value`, `threshold=3.0` |
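
To see why a lower `threshold` yields more candidates, consider a z-score rule: flag every point whose distance from the mean, measured in standard deviations, reaches the cutoff. The sketch below is plain Python under that assumption. The page does not say which score `point_anomalies` uses, and `zscore_anomalies` is a hypothetical helper.

```python
from statistics import mean, pstdev


def zscore_anomalies(values: list[float], threshold: float = 3.0) -> list[int]:
    """Return indices whose absolute z-score is at least `threshold`."""
    center = mean(values)
    spread = pstdev(values)  # population standard deviation
    if spread == 0:
        return []  # a constant series has no outliers
    return [i for i, v in enumerate(values) if abs(v - center) / spread >= threshold]
```

With nine values of 10 and one of 100, the spike has a z-score of about 3, so a cutoff of 2.0 flags it while a cutoff of 3.5 does not.
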
```python
series = session.observe(revenue, timescope={"start": "2026-01-01", "end": "2026-04-01"}, grain="day")
candidates = session.discover.point_anomalies(series, threshold=2.0)
candidates.show()
```

Transforms — session.transform.*

Transforms reshape a frame while preserving its family (`MetricFrame` → `MetricFrame`, `DeltaFrame` → `DeltaFrame`).
| Transform | Key parameters | Effect |
|---|---|---|
| `filter` | `predicate` (callable) | Keep rows where the predicate returns true. |
| `slice` | `where` (axis → value/list/range) | Keep rows matching exact axis values. |
| `rollup` | `drop_axes` | Drop axes and re-aggregate measures. |
| `topk` | `by`, `limit`, `order` | Keep the top N rows by a measure (`order="decrease"` default). |
| `bottomk` | `by`, `limit` | Keep the bottom N rows. |
| `rank` | `by`, `method`, `rank_column` | Add a rank column ordered by a measure. |
| `normalize` | `mode`, `baseline` | `index` / `share` / `pct_change` / `per_unit` / `z_score` (`MetricFrame` only). |
| `window` | `window` | Restrict to a time window. |
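
As a rough mental model for two of these transforms, the sketch below applies `rollup` and `topk` semantics to a list of plain dictionaries. It is an illustration only: the real transforms operate on typed frames, re-aggregating by summation is an assumption, and the helper signatures and sample rows are hypothetical.

```python
from collections import defaultdict

Row = dict  # one frame row: axis values plus a numeric measure


def rollup(rows: list[Row], drop_axes: list[str], measure: str) -> list[Row]:
    """Drop the given axes and re-aggregate the measure (summing is an assumption)."""
    totals: dict[tuple, float] = defaultdict(float)
    for row in rows:
        kept = tuple(sorted((k, v) for k, v in row.items() if k != measure and k not in drop_axes))
        totals[kept] += row[measure]
    return [{**dict(key), measure: total} for key, total in totals.items()]


def topk(rows: list[Row], by: str, limit: int) -> list[Row]:
    """Keep the `limit` rows with the largest value of `by`."""
    return sorted(rows, key=lambda row: row[by], reverse=True)[:limit]


rows = [
    {"region": "NA", "country": "US", "revenue": 80.0},
    {"region": "EU", "country": "DE", "revenue": 40.0},
    {"region": "EU", "country": "FR", "revenue": 50.0},
]
by_region = rollup(rows, drop_axes=["country"], measure="revenue")
top_region = topk(by_region, by="revenue", limit=1)
```

Both helpers return rows of the same kind they received, which mirrors the family-preserving behavior described above.
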
Escape hatch and promotion
When a step needs something the built-in intents do not model, drop to scratch frames, then promote back into the typed flow before continuing.

- `session.explore_ibis(builder, datasource=...)`: run a custom ibis query → `ExplorationResult`.
- `session.from_pandas(df)`: import external data → `ExplorationResult`.
- `session.promote_metric_frame(...)` / `promote_delta_frame(...)` / `promote_attribution_frame(...)`: upgrade a scratch frame into a typed frame. Promotion never infers metadata; you supply `metric`, `semantic_kind`, `measure_column`, etc. (or a `PromotionPolicy` with `semantic_anchors`), and it fails closed when anything required is missing.
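
"Fails closed" means a missing field is an error, never a guess. The sketch below shows that pattern in plain Python. It is not Marivo's promotion code: `check_promotion_metadata` is a hypothetical helper, and the required-field tuple is an illustrative subset taken from the names listed above.

```python
REQUIRED_FIELDS = ("metric", "semantic_kind", "measure_column")  # illustrative subset


def check_promotion_metadata(metadata: dict) -> dict:
    """Fail closed: raise unless every required field is explicitly supplied."""
    missing = [name for name in REQUIRED_FIELDS if metadata.get(name) is None]
    if missing:
        raise ValueError(f"cannot promote: missing {', '.join(missing)}")
    return metadata
```

Raising before any frame is built keeps a half-described scratch result from entering the typed flow.
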
Evidence and knowledge
Every operator records evidence into the session, so conclusions stay auditable.

- `session.knowledge()`: established facts, driver facts, open anomalies, and suggested follow-ups for the whole session.
- `session.evidence.findings(...)`, `.propositions(...)`, `.assessments(...)`, `.proposition(id)`, `.latest_assessment(id)`, `.trace(id)`: look up the evidence objects, and trace a proposition back to the findings that support it.

See Evidence for the full model.
End-to-end
```python
import marivo.analysis as mv

session = mv.session.get_or_create(name="revenue-check", question="Why did Q4 drop?")
catalog = session.catalog
revenue = catalog.get("sales.revenue")
region = catalog.get("sales.orders.region")

current = session.observe(
    revenue,
    timescope={"start": "2026-10-01", "end": "2027-01-01"},
    grain="month",
    dimensions=[region],
)
baseline = session.observe(
    revenue,
    timescope={"start": "2025-10-01", "end": "2026-01-01"},
    grain="month",
    dimensions=[region],
)
delta = session.compare(current, baseline)
attribution = session.decompose(delta, axis=region)
attribution.show()
```

From the delta you can branch: `session.discover.period_shifts(delta)` to find when it moved, or `session.forecast(current, horizon=3)` to project it forward.
Frame types
| Frame | Produced by |
|---|---|
| `MetricFrame` | `observe` (and `promote_metric_frame`) |
| `DeltaFrame` | `compare` |
| `AttributionFrame` | `decompose` |
| `AssociationResult` | `correlate` |
| `ForecastFrame` | `forecast` |
| `QualityReport` | `assess_quality` |
| `HypothesisTestResult` | `hypothesis_test` |
| `CandidateSet` | `discover.*` |
| `ExplorationResult` | `from_pandas`, `explore_ibis` |