sail.voyage is a flight recorder for long-running agent, evaluation, and
background-task trajectories. Your harness owns the work loop; Sail records
timeline events, spans, agent metadata, and correlated
inference calls. For a guided introduction, see the
Voyages guide; this page is the API reference.
import sail

with sail.voyage.run(name="overnight-eval"):
    with sail.voyage.agent("Solver", role="executor"):
        with sail.voyage.span("call-model"):
            sail.voyage.event("model.called", payload={"model": "zai-org/GLM-5"})
    print(sail.voyage.id(), sail.voyage.dashboard_url())
Two ways to call
Every operation is available two ways:
- Module-level helpers (sail.voyage.event(...), sail.voyage.span(...), …) act on the current Voyage: the process-global Voyage set by the most recent create()/attach().
- Methods on the Voyage object returned by create()/attach() act on that specific Voyage.
They behave identically; the module-level helpers just save you from threading
the Voyage object through your code.
The current Voyage is process-global, while span and agent contexts use Python
contextvars (so they follow lexical/async execution context). A raw
threading.Thread does not inherit the active span/agent, but can still use
the current Voyage for inference correlation.
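The split between process-global state and contextvars-based state can be illustrated with plain Python. The sketch below does not use Sail: `active_span` is a hypothetical stand-in for whatever context variable the SDK keeps internally. It compares a raw thread with a thread that runs inside an explicit copy of the caller's context.

```python
import contextvars
import threading

# Hypothetical stand-in for the SDK's internal "active span" variable.
active_span = contextvars.ContextVar("active_span", default=None)


def read_span(out, key):
    # Record whatever span this execution context can see.
    out[key] = active_span.get()


def compare_threads():
    seen = {}
    active_span.set("call-model")

    # Raw thread: started without the caller's context.
    raw = threading.Thread(target=read_span, args=(seen, "raw"))

    # Explicit copy: the thread runs inside a snapshot of the caller's context.
    ctx = contextvars.copy_context()
    copied = threading.Thread(target=ctx.run, args=(read_span, seen, "copied"))

    for t in (raw, copied):
        t.start()
    for t in (raw, copied):
        t.join()
    return seen


seen = compare_threads()
```

On a standard CPython build the raw thread sees the default value, while the copied-context thread sees "call-model". Whether wrapping a thread target in `copy_context().run` is the right way to carry Sail's span/agent into worker threads is an assumption here, not something this reference states.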
sail.voyage.run
def run(
    name: str,
    *,
    version: int | None = None,
    metadata: dict | None = None,
    sailbox_id: str | None = None,
) -> ContextManager[Voyage | NoopVoyage]
The recommended entry point: one Voyage around one block of work, with the
terminal lifecycle handled for you.
with sail.voyage.run("code-review", version=3) as voyage:
    do_work()
Creates the Voyage on enter (same arguments and semantics as
create() — always creates, never reads
SAIL_VOYAGE_ID), emits voyage.completed on clean exit, and on an
exception emits voyage.failed with the exception’s type and message, then
re-raises. Terminal delivery is the same bounded best-effort flush as
complete()/fail(); call voyage.flush() inside the block when you need
raise-on-failure delivery confirmation. Without SAIL_API_KEY the block
runs with telemetry disabled. Controllers whose create and complete sites
live in different places keep using create()/attach() directly.
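The lifecycle described above can be modeled with a small context manager. This is a sketch of the documented behavior only, not Sail's implementation: `run_sketch` and the `log` list are hypothetical, and catching `Exception` rather than `BaseException` is an assumption.

```python
from contextlib import contextmanager


@contextmanager
def run_sketch(name, log):
    """Hypothetical model of the run() lifecycle; `log` stands in for delivery."""
    log.append(("voyage.started", name))
    try:
        yield
    except Exception as exc:
        # On failure: record the type and message, then let the exception propagate.
        log.append(("voyage.failed", type(exc).__name__, str(exc)))
        raise
    else:
        log.append(("voyage.completed", name))


log = []
try:
    with run_sketch("code-review", log):
        raise ValueError("bad input")
except ValueError:
    pass  # re-raised by the context manager, as described above
```

After this runs, `log` holds a started entry followed by a failed entry carrying the exception's type name and message; a block that exits cleanly records a completed entry instead.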
sail.voyage.create
def create(
    name: str,
    *,
    version: int | None = None,
    metadata: dict | None = None,
    sailbox_id: str | None = None,
) -> Voyage | NoopVoyage
Creates a new Voyage, makes it the current Voyage for the process, and emits
voyage.started. Always creates — even when SAIL_VOYAGE_ID is set in the
environment; a child process joining its parent’s Voyage uses
attach() instead.
| Parameter | Default | Description |
|---|---|---|
| name | required | The series name the dashboard groups runs under. |
| version | None | Optional positive integer; bump when the harness/prompts/model mix change. |
| metadata | None | JSON object (≤ 64 KiB) attached to the Voyage. |
| sailbox_id | None | Bind the Voyage to a Sailbox. Falls back to SAILBOX_ID. |
Returns a Voyage. When SAIL_API_KEY is absent it
returns a no-op Voyage instead (see below).
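A harness that wants to fail early can mirror the documented argument limits before calling create(). The check below is a hypothetical pre-flight helper, not part of Sail: the exception type, the non-empty rule for `name`, and measuring metadata size as UTF-8 encoded JSON are all assumptions.

```python
import json

MAX_METADATA_BYTES = 64 * 1024  # 64 KiB limit from the table above


def check_create_args(name, version=None, metadata=None):
    """Hypothetical pre-flight check; exception types are assumptions."""
    if not isinstance(name, str) or not name:
        raise ValueError("name must be a non-empty string")
    if version is not None:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(version, bool) or not isinstance(version, int) or version < 1:
            raise ValueError("version must be a positive integer")
    if metadata is not None:
        if not isinstance(metadata, dict):
            raise ValueError("metadata must be a JSON object")
        # Assumption: size is measured on the UTF-8 encoded JSON text.
        size = len(json.dumps(metadata).encode("utf-8"))
        if size > MAX_METADATA_BYTES:
            raise ValueError(f"metadata is {size} bytes; limit is {MAX_METADATA_BYTES}")


check_create_args("code-review", version=3, metadata={"model": "zai-org/GLM-5"})
```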
sail.voyage.attach
def attach(voyage_id: str | None = None) -> Voyage | NoopVoyage
Attaches to an existing Voyage and makes it the current Voyage for the
process. voyage_id defaults to SAIL_VOYAGE_ID — the handoff a parent
process sets so its children join the parent’s Voyage (see
child-process attach). Raises
ValueError when neither is provided and telemetry is enabled; without
SAIL_API_KEY it returns a no-op Voyage instead, so a keyless child keeps
running with telemetry disabled. Attaching does not emit a second
voyage.started.
No-op when unauthenticated
If SAIL_API_KEY is not set, create() and attach() return a NoopVoyage:
no Voyage is created, no network calls are made, and id() /
dashboard_url() return None. Every method is a safe no-op. This lets the
same script run locally without credentials. (Sailbox and inference APIs
still require SAIL_API_KEY.)
The Voyage object
create() and attach() return a Voyage with these attributes:
| Attribute | Type | Description |
|---|---|---|
| id | str | Voyage id (None on a no-op Voyage). |
| dashboard_url | str \| None | Dashboard URL for the Voyage. |
| status | str \| None | Latest known terminal/lifecycle status. |
| name | str \| None | Voyage name. |
| version | int \| None | Voyage version. |
| sailbox_id | str \| None | Bound Sailbox, if any. |
| metadata | dict | Metadata supplied at creation. |
It exposes the same operations as the module-level helpers below
(event, span, agent, error, complete, fail, flush, headers).
event
def event(
    kind: str,
    level: str = "info",
    message: str | None = None,
    payload: dict | None = None,
    *,
    span_id: str | None = None,
    parent_span_id: str | None = None,
    error_type: str | None = None,
    occurred_at: str | None = None,
    sequence_id: int | None = None,
) -> None
Records a timestamped event on the current span/agent. Agent attribution
comes from the enclosing agent() context, or from the
SAIL_AGENT_* env defaults when no context is active — there is no
per-event override; a one-shot attributed event is
with voyage.agent(...): voyage.event(...). Events are buffered locally and
flushed by a background thread; event() validates input, enqueues quickly,
and does not raise network errors.
| Parameter | Default | Description |
|---|---|---|
| kind | required | Event kind/name (≤ 128 characters, non-empty). |
| level | "info" | One of debug, info, warn, error. |
| message | None | Human-readable message. Truncated at 4 KiB with a warning. |
| payload | None | JSON object (≤ 64 KiB). |
| span_id / parent_span_id | None | Override span placement; default from the active span. |
| error_type | None | Error class name for error events. |
| occurred_at | None | RFC3339 timestamp with timezone; defaults to now. |
| sequence_id | None | Explicit monotonic ordering id; auto-assigned when omitted. |
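The limits in the table, together with the oversize handling described under Validation and delivery semantics, can be sketched as one normalization step. `normalize_event` is a hypothetical helper, not a Sail function. It assumes sizes are measured in UTF-8 bytes, approximates the RFC3339 check with ISO 8601 parsing, and picks `ValueError` as the exception type.

```python
import json
import warnings
from datetime import datetime

LEVELS = {"debug", "info", "warn", "error"}
MAX_KIND_CHARS = 128
MAX_MESSAGE_BYTES = 4 * 1024
MAX_PAYLOAD_BYTES = 64 * 1024


def normalize_event(kind, level="info", message=None, payload=None, occurred_at=None):
    """Hypothetical pre-enqueue check mirroring the documented limits."""
    if not kind or len(kind) > MAX_KIND_CHARS:
        raise ValueError("kind must be 1 to 128 characters")
    if level not in LEVELS:
        raise ValueError(f"level must be one of {sorted(LEVELS)}")
    if occurred_at is not None:
        # Approximate RFC3339 check: parse as ISO 8601 and require an offset.
        text = occurred_at[:-1] + "+00:00" if occurred_at.endswith("Z") else occurred_at
        if datetime.fromisoformat(text).tzinfo is None:
            raise ValueError("occurred_at must include a timezone")
    if message is not None:
        raw = message.encode("utf-8")
        if len(raw) > MAX_MESSAGE_BYTES:
            warnings.warn("message truncated to 4 KiB")
            # errors="ignore" drops a multi-byte character cut in half.
            message = raw[:MAX_MESSAGE_BYTES].decode("utf-8", errors="ignore")
    if payload is not None:
        if not isinstance(payload, dict):
            raise ValueError("payload must be a JSON object")
        size = len(json.dumps(payload).encode("utf-8"))
        if size > MAX_PAYLOAD_BYTES:
            warnings.warn("payload replaced by a truncation stub")
            payload = {"_truncated": True, "_original_bytes": size}
    return {"kind": kind, "level": level, "message": message, "payload": payload}
```

Oversized text degrades with a warning while malformed input raises, which matches the split the reference draws between human text and structural validation.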
span
def span(
    span_name: str | None = None,
    *,
    message: str | None = None,
    payload: dict | None = None,
    span_id: str | None = None,
    parent_span_id: str | None = None,
) -> ContextManager  # also usable as a decorator
Usable as a context manager or a decorator — @sail.span() names the
span after the decorated function’s __qualname__; the with form requires
a name. The decorator resolves the current Voyage at call time, so
module-level decoration before create() attributes correctly. Decorating a
generator function raises TypeError (the context would close at generator
creation); async def is fully supported.
@sail.span()
def fetch_sources(urls):
    ...
Returns a context manager that emits span.started on enter and
span.completed (or span.failed, with the exception type) on exit. Spans
nest: a span opened inside another becomes its child automatically. A span
carries no agent identity of its own — wrap it in
agent() to attribute it and everything inside it to a named agent.
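The decorator behavior described in this section (naming from `__qualname__`, rejecting generator functions, supporting `async def`, and emitting started/completed/failed events) can be modeled in plain Python. `span_sketch` and the `events` list are hypothetical stand-ins. This is a sketch of the documented behavior, not Sail's code.

```python
import asyncio
import functools
import inspect

events = []  # stand-in for the SDK's event buffer


def span_sketch(fn):
    """Hypothetical decorator modeling the behavior described above."""
    if inspect.isgeneratorfunction(fn):
        raise TypeError("cannot decorate a generator function")
    name = fn.__qualname__

    def finish(exc):
        if exc is None:
            events.append(("span.completed", name))
        else:
            events.append(("span.failed", name, type(exc).__name__))

    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            events.append(("span.started", name))
            try:
                result = await fn(*args, **kwargs)
            except Exception as exc:
                finish(exc)
                raise
            finish(None)
            return result
        return async_wrapper

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        events.append(("span.started", name))
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:
            finish(exc)
            raise
        finish(None)
        return result
    return wrapper


@span_sketch
def fetch_sources(urls):
    return [u.upper() for u in urls]


@span_sketch
async def fetch_async(url):
    await asyncio.sleep(0)
    return url


fetch_sources(["a", "b"])
asyncio.run(fetch_async("c"))
```

Each call appends its own started and terminal entries at call time, so decorating at import time does not bind the wrapper to any state that exists before the first call.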
Span outcomes
The yielded span object accepts outcome data via merge_payload(): the
terminal event carries the started payload shallow-merged with everything
merged during the span (outcome keys win on conflict; repeated calls
accumulate). The span.started event is unchanged, and outcomes ride
span.failed too — partial results recorded before a crash are kept.
# Record outcome data on the span's terminal event.
with voyage.span("score-subject", payload={"subject": "tennis"}) as s:
    result = score(subject)
    s.merge_payload({"score": result.score, "verdict": result.verdict})

# Attribute a span and its events to a named agent.
with voyage.agent("Reviewer", role="reviewer"):
    with voyage.span("draft-review"):
        voyage.event("review.drafted")
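The shallow-merge rule for span outcomes can be illustrated with a small model. `SpanSketch` is hypothetical. In particular, letting a later `merge_payload()` call overwrite a key set by an earlier call is an assumption, since the reference only says that repeated calls accumulate.

```python
class SpanSketch:
    """Hypothetical model of a span's payload handling."""

    def __init__(self, payload=None):
        self.started_payload = dict(payload or {})
        self._outcome = {}

    def merge_payload(self, extra):
        # Repeated calls accumulate into one outcome dict.
        self._outcome.update(extra)

    def terminal_payload(self):
        # Shallow merge: outcome keys win over started keys on conflict.
        return {**self.started_payload, **self._outcome}


s = SpanSketch({"subject": "tennis", "score": None})
s.merge_payload({"score": 0.8})
s.merge_payload({"verdict": "pass"})
terminal = s.terminal_payload()
```

Here `terminal` combines the started payload with both merged outcomes, and `s.started_payload` is left untouched, mirroring the statement that span.started is unchanged.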
agent
def agent(
    name: str,
    *,
    role: str | None = None,
    slug: str | None = None,
) -> ContextManager
Returns a context manager that marks all events and inference calls inside it
as belonging to a named agent. name (the only required argument) is the
display identity shown in the dashboard; the stable attribution key is derived
from it automatically (lowercased, ASCII, hyphenated). role= is an optional
cohort taxonomy used for cross-workflow filtering. slug= (advanced) pins the
attribution key explicitly — use it when renaming a display name should keep
one identity, or when a child process must attach as the same agent.
with voyage.agent("Reviewer", role="reviewer"):
    ...
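To get a feel for what a "lowercased, ASCII, hyphenated" key might look like, here is one plausible derivation. The exact rules Sail applies are not specified in this reference, so `derive_slug` is purely illustrative; pin the key with slug= when the precise value matters.

```python
import re
import unicodedata


def derive_slug(name):
    """Hypothetical slug rule: lowercase, ASCII-only, hyphen-separated."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumerics into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


slug = derive_slug("Code Reviewer v2")
```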
agent() and span() double as decorators, which is the recommended spelling for
function-shaped work (sail.agent and sail.span are top-level re-exports
of the same objects):
import sail

@sail.agent("Researcher", role="researcher")
@sail.span()
def collect_sources(urls):
    return [fetch(url) for url in urls]
Each call enters a fresh agent/span frame (concurrent calls don’t share
state) and emits the same events as the with form.
error
def error(
    error_type: str | None = None,
    message: str | None = None,
    payload: dict | None = None,
) -> None
Records an error-level voyage.error event without terminating the Voyage.
complete
def complete(message: str | None = None, payload: dict | None = None) -> None
Emits voyage.completed and performs a bounded (10s) best-effort flush.
Always call complete() (or fail()) before the controller process exits —
terminal status is first-terminal-wins, and events after a terminal event are
best-effort. On delivery failure it warns and returns (the event stays
buffered for background/atexit retry) instead of raising; call
flush() afterwards if you need raise-on-failure delivery
confirmation.
fail
def fail(
    error_type: str = "harness_error",
    message: str | None = None,
    payload: dict | None = None,
) -> None
Emits voyage.failed and performs the same bounded best-effort flush as
complete() — it warns instead of raising on delivery failure. error_type
must be non-empty.
voyage = sail.voyage.create(name="task")
try:
    do_work()
    voyage.complete(message="ok")
except Exception as exc:
    voyage.fail(error_type=exc.__class__.__name__, message=str(exc))
    raise
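The first-terminal-wins rule mentioned under complete() can be pictured as a status that is set once and then ignored by later terminal events. `TerminalStatus` is a hypothetical model, and the status strings "completed" and "failed" are assumptions rather than documented values.

```python
class TerminalStatus:
    """Hypothetical model of first-terminal-wins status resolution."""

    TERMINAL = {"voyage.completed": "completed", "voyage.failed": "failed"}

    def __init__(self):
        self.status = None

    def observe(self, kind):
        # Only the first terminal event sets the status; later ones are ignored.
        if self.status is None and kind in self.TERMINAL:
            self.status = self.TERMINAL[kind]
        return self.status


state = TerminalStatus()
state.observe("voyage.started")
state.observe("voyage.failed")
state.observe("voyage.completed")  # arrives too late to change the outcome
```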
flush
def flush(timeout: float | None = None) -> None
Blocks until all buffered events are delivered, raising on delivery failure. A
bounded best-effort flush also runs automatically at process exit, but that is
not a substitute for complete()/fail() for product-critical terminal state.
headers
def headers(existing: Mapping[str, str] | None = None) -> dict[str, str]
Returns a copy of existing with the full attribution context set:
X-Sail-Voyage-Id for the current Voyage, plus X-Sail-Voyage-Span-Id and
X-Sail-Voyage-Agent-Id for the span/agent active at call time. Use this to
correlate a raw HTTP/OpenAI client with the Voyage when you can’t use the
inference wrappers. Compute it per request — never
once at client construction — so each call carries the context actually
active when it is made.
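The reason to compute headers per request can be shown with a stand-in that reads context variables at call time. Nothing below is Sail code: `headers_sketch`, the voyage id, and the two context variables are hypothetical, and omitting a header when no span or agent is active is an assumption. Only the header names come from this reference.

```python
import contextvars

# Hypothetical stand-ins for the SDK's internal state.
current_voyage_id = "voy_example"
active_span_id = contextvars.ContextVar("active_span_id", default=None)
active_agent_id = contextvars.ContextVar("active_agent_id", default=None)


def headers_sketch(existing=None):
    """Return a copy of `existing` plus attribution headers for the current context."""
    out = dict(existing or {})
    out["X-Sail-Voyage-Id"] = current_voyage_id
    span_id = active_span_id.get()
    agent_id = active_agent_id.get()
    # Assumption: a header is omitted when no span/agent is active.
    if span_id is not None:
        out["X-Sail-Voyage-Span-Id"] = span_id
    if agent_id is not None:
        out["X-Sail-Voyage-Agent-Id"] = agent_id
    return out


base = {"Authorization": "Bearer <key>"}
at_construction = headers_sketch(base)  # computed once, before any span is active

token = active_span_id.set("span-1")
per_request = headers_sketch(base)      # computed while span-1 is active
active_span_id.reset(token)
```

Headers captured at client construction miss the span that is active when the request is actually sent, which is the failure mode the per-request rule avoids. The input mapping is copied, never mutated.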
child_env
def child_env(*, agent: bool = True) -> dict[str, str]
Returns the environment variables a child process needs to attach() to the
current Voyage. Merge them into the child’s environment instead of exporting
SAIL_VOYAGE_ID by hand. With agent=True (default) the active agent()
context rides along as the child’s SAIL_AGENT_* defaults. Returns {}
when telemetry is disabled, so the handoff is keyless-safe.
subprocess.run(["python", "worker.py"], env={**os.environ, **sail.voyage.child_env()})
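The merge pattern itself can be exercised without Sail by substituting a plain dict for the return value of child_env(). In the sketch below, `handoff` is a hypothetical example of such a dict: the variable names come from the environment variable table in this reference, but the values and the exact set of keys are made up.

```python
import os
import subprocess
import sys

# Hypothetical output of child_env(); the id and agent values are made up.
handoff = {
    "SAIL_VOYAGE_ID": "voy_example",
    "SAIL_AGENT_ID": "solver",
    "SAIL_AGENT_NAME": "Solver",
    "SAIL_AGENT_ROLE": "executor",
}

child_code = "import os; print(os.environ.get('SAIL_VOYAGE_ID'), os.environ.get('SAIL_AGENT_ID'))"


def run_child(extra_env):
    # Merge over a copy of the parent environment; os.environ itself is not mutated.
    env = {**os.environ, **extra_env}
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with_handoff = run_child(handoff)
```

Passing an empty dict leaves the child's environment identical to the parent's, which is why a `{}` return value makes the same launch code safe when telemetry is disabled.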
disable
def disable() -> NoopVoyage
Disables Voyage telemetry for this process by installing a no-op current
Voyage. This is the public form of the state create()/attach() enter when
SAIL_API_KEY is absent. It is intended for controllers that catch a startup
telemetry failure and choose to continue unobserved.
Module helpers
def id() -> str | None
def dashboard_url() -> str | None
sail.voyage.id() and sail.voyage.dashboard_url() return the current
Voyage’s id and dashboard URL (or None when there is no current Voyage or
it is a no-op).
Environment variables
| Variable | Purpose |
|---|---|
| SAIL_API_KEY | Required for real Voyage events and inference. Use an org-bearing sk_... key. |
| SAIL_MODE | Selects the SDK routing mode (prod default; dev, staging). |
| SAIL_API_URL | Overrides the REST API URL directly. |
| SAILBOX_ID | Default sailbox_id attached to new Voyages. |
| SAIL_VOYAGE_ID | Default voyage id for attach(). |
| SAIL_AGENT_ID | Default event agent_id; normalized to the derived slug form, so it matches agent() in a parent process. |
| SAIL_AGENT_NAME | Default event agent_name. |
| SAIL_AGENT_ROLE | Default event agent_role. |
| SAIL_VOYAGE_DEBUG | Adds per-occurrence repeats of degradation warnings. One warning per category (no-op mode, dropped events, stubbed payloads, failed flushes) is always emitted. |
Validation and delivery semantics
- Validation: kind ≤ 128 characters; level one of debug/info/warn/error; payload and metadata must be JSON objects; occurred_at must be RFC3339 with a timezone; version must be a positive integer. Oversized human text does not raise: a message or span name over 4 KiB is truncated, and a payload over 64 KiB is replaced by a {"_truncated": true, "_original_bytes": N} stub, each with a warning. metadata over 64 KiB raises at create() (startup is when failing is cheapest). create() and attach() validate their arguments before the no-key gate, so a malformed call raises even when telemetry is disabled.
- Buffering: events go to a bounded local buffer drained by a background flusher (≈ 1s interval). When the buffer is full, the oldest non-terminal events are dropped first and an sdk.events_dropped notice is emitted; terminal events (voyage.started/completed/failed) are preserved.
- Delivery semantics: flush() blocks and raises delivery errors (sail.VoyageError and subclasses); it is the strict primitive. complete() and fail() perform a bounded best-effort flush and warn instead of raising. event() never raises network errors.
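The buffering rule above (drop the oldest non-terminal event first, keep terminal events) can be sketched with a small bounded buffer. `BoundedBuffer` is a hypothetical model, not Sail's buffer. The `dropped` counter stands in for the sdk.events_dropped notice, and growing past capacity when only terminal events remain is an assumption.

```python
TERMINAL_KINDS = {"voyage.started", "voyage.completed", "voyage.failed"}


class BoundedBuffer:
    """Hypothetical drop policy: evict the oldest non-terminal event when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.dropped = 0

    def push(self, kind):
        if len(self.items) >= self.capacity:
            for i, existing in enumerate(self.items):
                if existing not in TERMINAL_KINDS:
                    del self.items[i]
                    self.dropped += 1
                    break
            # Assumption: if everything buffered is terminal, grow rather than drop.
        self.items.append(kind)


buf = BoundedBuffer(capacity=3)
for kind in ["voyage.started", "step.1", "step.2", "step.3", "voyage.completed"]:
    buf.push(kind)
```

With a capacity of three, the two oldest step events are evicted while voyage.started and voyage.completed both survive.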