sail.voyage is a flight recorder for long-running agent, evaluation, and background-task trajectories. Your harness owns the work loop; Sail records timeline events, spans, agent metadata, and correlated inference calls. For a guided introduction, see the Voyages guide; this page is the API reference.
import sail

with sail.voyage.run(name="overnight-eval"):
    with sail.voyage.agent("Solver", role="executor"):
        with sail.voyage.span("call-model"):
            sail.voyage.event("model.called", payload={"model": "zai-org/GLM-5"})

print(sail.voyage.id(), sail.voyage.dashboard_url())

Two ways to call

Every operation is available two ways:
  • Module-level helpers (sail.voyage.event(...), sail.voyage.span(...), …) act on the current Voyage — the process-global Voyage set by the most recent create()/attach().
  • Methods on the Voyage object returned by create()/attach() act on that specific Voyage.
They behave identically; the module-level helpers just save you from threading the Voyage object through your code.
The current Voyage is process-global, while span and agent contexts use Python contextvars (so they follow lexical/async execution context). A raw threading.Thread does not inherit the active span/agent, but can still use the current Voyage for inference correlation.
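The thread behavior described above comes from how Python's contextvars work in general, and can be reproduced without the SDK. The sketch below uses a hypothetical `active_span` variable as a stand-in for the SDK's internal span context (the real variable name is not documented here). It shows that a raw thread sees the default value, while a thread started through a copied context sees the caller's span.

```python
import contextvars
import threading

# Hypothetical stand-in for the SDK's active-span context variable.
active_span = contextvars.ContextVar("active_span", default="none")

results = {}


def worker(label):
    # Record whichever span value this thread's context can see.
    results[label] = active_span.get()


active_span.set("call-model")

# A raw thread starts with a fresh context, so it sees the default value.
plain = threading.Thread(target=worker, args=("plain",))
plain.start()
plain.join()

# Copying the caller's context and running the target inside it
# carries the active span over to the new thread.
ctx = contextvars.copy_context()
propagated = threading.Thread(target=ctx.run, args=(worker, "propagated"))
propagated.start()
propagated.join()

print(results)
```

If worker threads need span or agent attribution, copying the context at thread start as above is one option; opening a new span or agent context inside the thread is another.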

sail.voyage.run

def run(
    name: str,
    *,
    version: int | None = None,
    metadata: dict | None = None,
    sailbox_id: str | None = None,
) -> ContextManager[Voyage | NoopVoyage]
The recommended entry point: one Voyage around one block of work, with the terminal lifecycle handled for you.
with sail.voyage.run("code-review", version=3) as voyage:
    do_work()
Creates the Voyage on enter (same arguments and semantics as create() — always creates, never reads SAIL_VOYAGE_ID), emits voyage.completed on clean exit, and on an exception emits voyage.failed with the exception’s type and message, then re-raises. Terminal delivery is the same bounded best-effort flush as complete()/fail(); call voyage.flush() inside the block when you need raise-on-failure delivery confirmation. Without SAIL_API_KEY the block runs with telemetry disabled. Controllers whose create and complete sites live in different places keep using create()/attach() directly.
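The enter/exit behavior described above can be pictured as an ordinary context manager. The following is a minimal sketch, not the SDK's implementation: `run_sketch` and the in-memory `events` list are hypothetical stand-ins, and flushing is omitted.

```python
from contextlib import contextmanager

# Hypothetical in-memory stand-in for the SDK's event buffer.
events = []


@contextmanager
def run_sketch(name):
    events.append(("voyage.started", name))
    try:
        yield name
    except Exception as exc:
        # Failure path: record the exception type and message, then re-raise.
        events.append(("voyage.failed", type(exc).__name__, str(exc)))
        raise
    else:
        # Clean exit: record completion.
        events.append(("voyage.completed", name))


with run_sketch("code-review"):
    pass

try:
    with run_sketch("code-review"):
        raise ValueError("bad input")
except ValueError:
    pass  # the exception still reaches the caller

for recorded in events:
    print(recorded)
```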

sail.voyage.create

def create(
    name: str,
    *,
    version: int | None = None,
    metadata: dict | None = None,
    sailbox_id: str | None = None,
) -> Voyage | NoopVoyage
Creates a new Voyage, makes it the current Voyage for the process, and emits voyage.started. Always creates — even when SAIL_VOYAGE_ID is set in the environment; a child process joining its parent’s Voyage uses attach() instead.
| Parameter | Default | Description |
| --- | --- | --- |
| name | | Required. The series name the dashboard groups runs under. |
| version | None | Optional positive integer; bump when the harness/prompts/model mix change. |
| metadata | None | JSON object (≤ 64 KiB) attached to the Voyage. |
| sailbox_id | None | Bind the Voyage to a Sailbox. Falls back to SAILBOX_ID. |
Returns a Voyage. When SAIL_API_KEY is absent it returns a no-op Voyage instead (see below).

sail.voyage.attach

def attach(voyage_id: str | None = None) -> Voyage | NoopVoyage
Attaches to an existing Voyage and makes it the current Voyage for the process. voyage_id defaults to SAIL_VOYAGE_ID — the handoff a parent process sets so its children join the parent’s Voyage (see child-process attach). Raises ValueError when neither is provided and telemetry is enabled; without SAIL_API_KEY it returns a no-op Voyage instead, so a keyless child keeps running with telemetry disabled. Attaching does not emit a second voyage.started.

No-op when unauthenticated

If SAIL_API_KEY is not set, create() and attach() return a NoopVoyage: no Voyage is created, no network calls are made, and id() / dashboard_url() return None. Every method is a safe no-op. This lets the same script run locally without credentials. (Sailbox and inference APIs still require SAIL_API_KEY.)

The Voyage object

create() and attach() return a Voyage with these attributes:
| Attribute | Type | Description |
| --- | --- | --- |
| id | str | Voyage id (None on a no-op Voyage). |
| dashboard_url | str \| None | Dashboard URL for the Voyage. |
| status | str \| None | Latest known terminal/lifecycle status. |
| name | str \| None | Voyage name. |
| version | int \| None | Voyage version. |
| sailbox_id | str \| None | Bound Sailbox, if any. |
| metadata | dict | Metadata supplied at creation. |
It exposes the same operations as the module-level helpers below (event, span, agent, error, complete, fail, flush, headers).

event

def event(
    kind: str,
    level: str = "info",
    message: str | None = None,
    payload: dict | None = None,
    *,
    span_id: str | None = None,
    parent_span_id: str | None = None,
    error_type: str | None = None,
    occurred_at: str | None = None,
    sequence_id: int | None = None,
) -> None
Records a timestamped event on the current span/agent. Agent attribution comes from the enclosing agent() context, or from the SAIL_AGENT_* env defaults when no context is active. There is no per-event override; to attribute a single event, wrap it in an agent context: with voyage.agent(...): voyage.event(...). Events are buffered locally and flushed by a background thread; event() validates input, enqueues quickly, and does not raise network errors.
| Parameter | Default | Description |
| --- | --- | --- |
| kind | required | Event kind/name (≤ 128 characters, non-empty). |
| level | "info" | One of debug, info, warn, error. |
| message | None | Human-readable message. Truncated at 4 KiB with a warning. |
| payload | None | JSON object (≤ 64 KiB). |
| span_id / parent_span_id | None | Override span placement; default from the active span. |
| error_type | None | Error class name for error events. |
| occurred_at | None | RFC3339 timestamp with timezone; defaults to now. |
| sequence_id | None | Explicit monotonic ordering id; auto-assigned when omitted. |
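When supplying occurred_at or a large payload yourself, it can help to build and check the values before calling event(). The sketch below uses only the standard library. The helper names are hypothetical, and measuring payload size on the UTF-8 encoded JSON text is an assumption about how the 64 KiB limit is applied.

```python
import json
from datetime import datetime, timezone

MAX_PAYLOAD_BYTES = 64 * 1024  # 64 KiB limit from the parameter table


def rfc3339_now():
    # isoformat() on a timezone-aware datetime includes the UTC offset.
    return datetime.now(timezone.utc).isoformat()


def payload_size(payload):
    # Assumption: size is measured on the UTF-8 encoded JSON text.
    return len(json.dumps(payload).encode("utf-8"))


stamp = rfc3339_now()
payload = {"model": "zai-org/GLM-5", "attempt": 1}

# Round-trip the timestamp to confirm it carries a timezone.
parsed = datetime.fromisoformat(stamp)
print(stamp, parsed.tzinfo is not None, payload_size(payload) <= MAX_PAYLOAD_BYTES)
```

A value built this way could then be passed as occurred_at=stamp alongside payload=payload.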

span

def span(
    span_name: str | None = None,
    *,
    message: str | None = None,
    payload: dict | None = None,
    span_id: str | None = None,
    parent_span_id: str | None = None,
) -> ContextManager  # also usable as a decorator
Usable as a context manager or a decorator. As a decorator, @sail.span() names the span after the decorated function’s __qualname__; the with form requires a name. The decorator resolves the current Voyage at call time, so module-level decoration before create() attributes correctly. Decorating a generator function raises TypeError (the context would close at generator creation); async def is fully supported.
@sail.span()
def fetch_sources(urls):
    ...
Returns a context manager that emits span.started on enter and span.completed (or span.failed, with the exception type) on exit. Spans nest: a span opened inside another becomes its child automatically. A span carries no agent identity of its own — wrap it in agent() to attribute it and everything inside it to a named agent.

Span outcomes

The yielded span object accepts outcome data via merge_payload(): the terminal event carries the started payload shallow-merged with everything merged during the span (outcome keys win on conflict; repeated calls accumulate). The span.started event is unchanged, and outcomes ride span.failed too — partial results recorded before a crash are kept.
subject = "tennis"
with voyage.span("score-subject", payload={"subject": subject}) as s:
    result = score(subject)
    s.merge_payload({"score": result.score, "verdict": result.verdict})

To attribute a span and everything inside it to a named agent, wrap it in agent():
with voyage.agent("Reviewer", role="reviewer"):
    with voyage.span("draft-review"):
        voyage.event("review.drafted")
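The merge rule described under Span outcomes (started payload shallow-merged with outcome data, outcome keys winning on conflict) can be sketched with plain dictionaries. `terminal_payload` is a hypothetical helper, not an SDK function, and the assumption that a later merge_payload() call overrides an earlier one for the same key is mine.

```python
def terminal_payload(started, merges):
    # Copy first so the span.started payload is left unchanged.
    merged = dict(started)
    for outcome in merges:
        # Shallow merge: outcome keys replace same-named keys wholesale.
        merged.update(outcome)
    return merged


started = {"subject": "tennis", "score": None}
merges = [{"score": 0.5}, {"score": 0.9, "verdict": "pass"}]

final = terminal_payload(started, merges)
print(final)
```

Because the merge is shallow, a nested dictionary in an outcome replaces the whole nested value from the started payload rather than being combined with it.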

agent

def agent(
    name: str,
    *,
    role: str | None = None,
    slug: str | None = None,
) -> ContextManager
Returns a context manager that marks all events and inference calls inside it as belonging to a named agent. name (the only required argument) is the display identity shown in the dashboard; the stable attribution key is derived from it automatically (lowercased, ASCII, hyphenated). role= is an optional cohort taxonomy used for cross-workflow filtering. slug= (advanced) pins the attribution key explicitly — use it when renaming a display name should keep one identity, or when a child process must attach as the same agent.
with voyage.agent("Reviewer", role="reviewer"):
    ...
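To get a feel for what a derived attribution key might look like, here is a rough approximation of "lowercased, ASCII, hyphenated". The `derive_slug` function is hypothetical and the SDK's exact normalization rules may differ, so pin slug= explicitly when the key must be stable.

```python
import re
import unicodedata


def derive_slug(name):
    # Fold accented characters to ASCII where possible and drop the rest.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then collapse each run of non-alphanumerics into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


print(derive_slug("Reviewer"))
print(derive_slug("Code Reviewer #2"))
print(derive_slug("Café Solver"))
```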

Decorator form

agent() and span() double as decorators, which is the recommended spelling for function-shaped work (sail.agent and sail.span are top-level re-exports of the same objects):
import sail

@sail.agent("Researcher", role="researcher")
@sail.span()
def collect_sources(urls):
    return [fetch(url) for url in urls]
Each call enters a fresh agent/span frame (concurrent calls don’t share state) and emits the same events as the with form.

error

def error(
    error_type: str | None = None,
    message: str | None = None,
    payload: dict | None = None,
) -> None
Records an error-level voyage.error event without terminating the Voyage.

complete

def complete(message: str | None = None, payload: dict | None = None) -> None
Emits voyage.completed and performs a bounded (10s) best-effort flush. Always call complete() (or fail()) before the controller process exits — terminal status is first-terminal-wins, and events after a terminal event are best-effort. On delivery failure it warns and returns (the event stays buffered for background/atexit retry) instead of raising; call flush() afterwards if you need raise-on-failure delivery confirmation.

fail

def fail(
    error_type: str = "harness_error",
    message: str | None = None,
    payload: dict | None = None,
) -> None
Emits voyage.failed and performs the same bounded best-effort flush as complete() — it warns instead of raising on delivery failure. error_type must be non-empty.
voyage = sail.voyage.create(name="task")
try:
    do_work()
    voyage.complete(message="ok")
except Exception as exc:
    voyage.fail(error_type=exc.__class__.__name__, message=str(exc))
    raise

flush

def flush(timeout: float | None = None) -> None
Blocks until all buffered events are delivered, raising on delivery failure. A bounded best-effort flush also runs automatically at process exit, but that is not a substitute for complete()/fail() for product-critical terminal state.

headers

def headers(existing: Mapping[str, str] | None = None) -> dict[str, str]
Returns a copy of existing with the full attribution context set: X-Sail-Voyage-Id for the current Voyage, plus X-Sail-Voyage-Span-Id and X-Sail-Voyage-Agent-Id for the span/agent active at call time. Use this to correlate a raw HTTP/OpenAI client with the Voyage when you can’t use the inference wrappers. Compute it per request — never once at client construction — so each call carries the context actually active when it is made.
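The advice to compute headers per request can be illustrated without the SDK. In this sketch, `headers_sketch`, `current_voyage_id`, and `active_span_id` are hypothetical stand-ins, and leaving the span header out when no span is active is an assumption. Headers built once before a span opens miss the span id, while headers built inside the span carry it.

```python
import contextvars

# Hypothetical stand-ins for the SDK's current Voyage and active span.
current_voyage_id = "voy_example"
active_span_id = contextvars.ContextVar("active_span_id", default=None)


def headers_sketch(existing=None):
    # Work on a copy so the caller's mapping is never mutated.
    merged = dict(existing or {})
    merged["X-Sail-Voyage-Id"] = current_voyage_id
    span_id = active_span_id.get()
    if span_id is not None:
        merged["X-Sail-Voyage-Span-Id"] = span_id
    return merged


base = {"Authorization": "Bearer example-token"}

# Computed once, before any span is active: no span header.
stale = headers_sketch(base)

# Computed per request, while a span is active: span header present.
token = active_span_id.set("span-1")
fresh = headers_sketch(base)
active_span_id.reset(token)

print(sorted(stale))
print(sorted(fresh))
```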

child_env

def child_env(*, agent: bool = True) -> dict[str, str]
Returns the environment variables a child process needs to attach() to the current Voyage. Merge them into the child’s environment instead of exporting SAIL_VOYAGE_ID by hand. With agent=True (default) the active agent() context rides along as the child’s SAIL_AGENT_* defaults. Returns {} when telemetry is disabled, so the handoff is keyless-safe.
subprocess.run(["python", "worker.py"], env={**os.environ, **sail.voyage.child_env()})
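The handoff itself is ordinary environment inheritance, which the following standard-library sketch demonstrates. The `handoff` dictionary is a hypothetical stand-in for the return value of child_env() (the ids are made up), and the child simply prints what it received instead of calling attach().

```python
import os
import subprocess
import sys

# Hypothetical stand-in for sail.voyage.child_env(); the values are made up.
handoff = {"SAIL_VOYAGE_ID": "voy_example", "SAIL_AGENT_NAME": "Solver"}

# The child reports the Voyage id it would pass to attach().
child_code = "import os; print(os.environ.get('SAIL_VOYAGE_ID', 'unset'))"


def run_child(extra_env):
    # Merge the handoff over the parent's environment, as in the call above.
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env={**os.environ, **extra_env},
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with_handoff = run_child(handoff)
print(with_handoff)
```

Merging an empty dictionary leaves the child's environment unchanged, which is why an empty return value from child_env() is safe to spread into env unconditionally.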

disable

def disable() -> NoopVoyage
Disables Voyage telemetry for this process by installing a no-op current Voyage. This is the public form of the state that create()/attach() enter when SAIL_API_KEY is absent. It is intended for controllers that catch a startup telemetry failure and choose to continue unobserved.

Module helpers

def id() -> str | None
def dashboard_url() -> str | None
sail.voyage.id() and sail.voyage.dashboard_url() return the current Voyage’s id and dashboard URL (or None when there is no current Voyage or it is a no-op).

Environment variables

| Variable | Purpose |
| --- | --- |
| SAIL_API_KEY | Required for real Voyage events and inference. Use an org-bearing sk_... key. |
| SAIL_MODE | Selects the SDK routing mode (prod default; dev, staging). |
| SAIL_API_URL | Overrides the REST API URL directly. |
| SAILBOX_ID | Default sailbox_id attached to new Voyages. |
| SAIL_VOYAGE_ID | Default voyage id for attach(). |
| SAIL_AGENT_ID | Default event agent_id; normalized to the derived slug form, so it matches agent() in a parent process. |
| SAIL_AGENT_NAME | Default event agent_name. |
| SAIL_AGENT_ROLE | Default event agent_role. |
| SAIL_VOYAGE_DEBUG | Adds per-occurrence repeats of degradation warnings. One warning per category (no-op mode, dropped events, stubbed payloads, failed flushes) is always emitted. |

Validation and delivery semantics

  • Validation: kind ≤ 128 characters; level one of debug/info/warn/error; payload and metadata must be JSON objects; occurred_at must be RFC3339 with a timezone; version must be a positive integer. Oversized human text does not raise: a message or span name over 4 KiB is truncated, and a payload over 64 KiB is replaced by a {"_truncated": true, "_original_bytes": N} stub — each with a warning. metadata over 64 KiB raises at create() (startup is when failing is cheapest). create() and attach() validate their arguments before the no-key gate, so a malformed call raises even when telemetry is disabled.
  • Buffering: events go to a bounded local buffer drained by a background flusher (≈ 1s interval). When the buffer is full, the oldest non-terminal events are dropped first and an sdk.events_dropped notice is emitted; terminal events (voyage.started/completed/failed) are preserved.
  • Delivery semantics: flush() blocks and raises delivery errors (sail.VoyageError and subclasses) — the strict primitive. complete() and fail() perform a bounded best-effort flush and warn instead of raising. event() never raises network errors.
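The drop policy in the Buffering item above can be sketched as a small bounded buffer. This is an illustration only: `BoundedBuffer` and its capacity are hypothetical, the real buffer size is not stated here, and the sketch counts drops instead of emitting an sdk.events_dropped notice.

```python
# Event kinds the buffering rule says are preserved.
PRESERVED_KINDS = {"voyage.started", "voyage.completed", "voyage.failed"}


class BoundedBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.events = []
        self.dropped = 0

    def push(self, kind):
        if len(self.events) >= self.capacity:
            # Evict the oldest event that is not a preserved kind.
            for index, existing in enumerate(self.events):
                if existing not in PRESERVED_KINDS:
                    del self.events[index]
                    self.dropped += 1
                    break
        self.events.append(kind)


buffer = BoundedBuffer(capacity=3)
for kind in ["voyage.started", "step.1", "step.2", "step.3", "voyage.completed"]:
    buffer.push(kind)

print(buffer.events, buffer.dropped)
```

In this sketch a buffer holding only preserved events is allowed to grow past its capacity; how the SDK handles that edge case is not specified in this reference.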