The sail.tinker helpers let you drive Sail inference from a Tinker RL or training loop. They bridge Tinker’s token-level sampling interface to Sail’s raw-token Responses path, so rollouts run against Sail-hosted models (optionally with a LoRA adapter) while logprobs flow back into your training code.
These helpers require tinker-cookbook installed alongside sail-sdk. Constructing a SailTokenCompleter without tinker-cookbook available raises sail.InferenceError.
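Because a missing tinker-cookbook only surfaces when the completer is constructed, a training script may prefer to check for it at startup. The following is a minimal sketch, not part of sail-sdk: the helper name has_module is hypothetical, and the top-level module name tinker_cookbook is an assumption about how the package is installed.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


# Assumption: tinker-cookbook installs a top-level module named "tinker_cookbook".
if not has_module("tinker_cookbook"):
    print("tinker-cookbook not found; SailTokenCompleter is expected to raise sail.InferenceError")
```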

sail.SailTokenCompleter

A Tinker TokenCompleter backed by Sail’s raw-token Responses API. Construct one with a model and sampling settings, then await it on tokenized prompts to get sampled tokens and their logprobs.
import sail

completer = sail.SailTokenCompleter(
    model="meta-llama/Llama-3.1-8B-Instruct",
    max_tokens=256,
    temperature=0.7,
    completion_window="priority",
)

# Run inside an async function; model_input is a Tinker ModelInput holding the prompt tokens.
result = await completer(model_input)
print(result.tokens)         # list[int] of sampled token ids
print(result.maybe_logprobs) # list[float] | None
print(result.stop_reason)    # e.g. "stop", "length"
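In an RL loop you often want several samples per prompt. Since the completer is awaited, one possible approach is to issue the calls concurrently with asyncio.gather. The sketch below runs offline by using hypothetical stand-ins (FakeCompleter, FakeModelInput, FakeResult) that only mimic the call shape and result fields described on this page; in real code you would pass a sail.SailTokenCompleter and a Tinker ModelInput instead.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in with the same three fields as TokensWithLogprobs."""
    tokens: list[int]
    maybe_logprobs: list[float] | None
    stop_reason: str


class FakeModelInput:
    """Hypothetical stand-in for Tinker's ModelInput: exposes to_ints()."""

    def __init__(self, ids):
        self._ids = list(ids)

    def to_ints(self):
        return list(self._ids)


class FakeCompleter:
    """Hypothetical stub so the sketch runs without network access."""

    async def __call__(self, model_input, stop=None):
        await asyncio.sleep(0)  # stands in for the request round trip
        prompt = model_input.to_ints()
        return FakeResult(
            tokens=[t + 1 for t in prompt],
            maybe_logprobs=[-0.5] * len(prompt),
            stop_reason="stop",
        )


async def sample_group(completer, model_input, group_size):
    """Issue group_size independent samples for one prompt concurrently."""
    calls = (completer(model_input) for _ in range(group_size))
    return await asyncio.gather(*calls)


rollouts = asyncio.run(sample_group(FakeCompleter(), FakeModelInput([1, 2, 3]), group_size=4))
# maybe_logprobs can be None, so guard before summing per-rollout logprobs.
total_logprob = [sum(r.maybe_logprobs or []) for r in rollouts]
```

Guarding maybe_logprobs matters when request_logprobs is disabled or the response carries no logprobs.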

Constructor

SailTokenCompleter(
    *,
    model: str,
    max_tokens: int,
    temperature: float = 1.0,
    top_p: float = 1.0,
    completion_window: str = "priority",
    lora: str | None = None,
    tinker_lora_signed_url: str | None = None,
    adapter_config: Mapping[str, Any] | str | None = None,
    tinker_lora_name: str | None = None,
    metadata: Mapping[str, str] | None = None,
    timeout: float | None = None,
    voyage: sail.Voyage | None = None,
    request_logprobs: bool = True,
)
| Parameter | Description |
| --- | --- |
| model | Required. Model id to sample from. Raises ValueError if empty. |
| max_tokens | Required. Max tokens to generate; must be > 0 (ValueError otherwise). |
| temperature | Sampling temperature. Default 1.0. |
| top_p | Nucleus sampling cutoff. Default 1.0. |
| completion_window | Completion window for each request. Default "priority". LoRA requests cannot use "asap"; the selected window must be supported by the model. |
| lora | Name of a Sail-registered LoRA adapter to apply. Mutually exclusive with tinker_lora_signed_url. |
| tinker_lora_signed_url | Signed URL to a Tinker checkpoint archive (see get_tinker_checkpoint_signed_url_async). Requires adapter_config. Mutually exclusive with lora. |
| adapter_config | LoRA adapter config as a mapping or JSON string. Required when tinker_lora_signed_url is set. |
| tinker_lora_name | Optional human-readable name attached to the Tinker LoRA. |
| metadata | Extra string metadata forwarded on each request. |
| timeout | Per-request timeout in seconds. |
| voyage | A sail.Voyage to attribute requests to a voyage. |
| request_logprobs | Whether to request logprobs from the server. Default True. |
Passing both lora and tinker_lora_signed_url, or setting tinker_lora_signed_url without adapter_config, raises ValueError.
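The argument rules above can be restated as a small pre-flight check, which is handy when completer settings come from a config file and you want to fail before any training state is built. This is an illustrative sketch only: check_completer_args is a hypothetical helper, not the SDK's own validation, and the error messages are made up. It leaves out the "asap" restriction because this page does not say where that rule is enforced.

```python
from __future__ import annotations

from typing import Any, Mapping


def check_completer_args(
    *,
    model: str,
    max_tokens: int,
    lora: str | None = None,
    tinker_lora_signed_url: str | None = None,
    adapter_config: Mapping[str, Any] | str | None = None,
) -> None:
    """Raise ValueError for the argument combinations documented as rejected."""
    if not model:
        raise ValueError("model must be a non-empty string")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be > 0")
    if lora is not None and tinker_lora_signed_url is not None:
        raise ValueError("lora and tinker_lora_signed_url are mutually exclusive")
    if tinker_lora_signed_url is not None and adapter_config is None:
        raise ValueError("tinker_lora_signed_url requires adapter_config")


# A registered adapter by name is fine on its own.
check_completer_args(model="meta-llama/Llama-3.1-8B-Instruct", max_tokens=256, lora="my-adapter")

# A signed URL without adapter_config is rejected.
try:
    check_completer_args(
        model="meta-llama/Llama-3.1-8B-Instruct",
        max_tokens=256,
        tinker_lora_signed_url="https://example.invalid/archive",
    )
except ValueError as exc:
    print(f"rejected: {exc}")
```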

async __call__(model_input, stop=None)

async def __call__(model_input, stop=None) -> TokensWithLogprobs
  • model_input — must expose a callable .to_ints() returning the prompt token ids (this is Tinker’s ModelInput). A non-callable to_ints, a non-integer token, or an empty prompt raises TypeError/ValueError.
  • stop — optional stop condition. An int is wrapped as a single-element list; a tuple is converted to a list; other values pass through unchanged.
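The two input rules above can be sketched as plain functions. Both helper names are hypothetical. The stop handling follows the description directly; which exception type each prompt check raises is an assumption here, since this page only says TypeError/ValueError.

```python
def normalize_stop(stop):
    """Wrap an int in a list, turn a tuple into a list, pass anything else through."""
    if isinstance(stop, int):
        # Note: bool is a subclass of int in Python, so True/False would also be wrapped.
        return [stop]
    if isinstance(stop, tuple):
        return list(stop)
    return stop


def prompt_token_ids(model_input):
    """Extract and validate prompt token ids from an object exposing to_ints()."""
    to_ints = getattr(model_input, "to_ints", None)
    if not callable(to_ints):
        # Assumption: a non-callable to_ints is reported as TypeError.
        raise TypeError("model_input.to_ints must be callable")
    ids = list(to_ints())
    if not ids:
        # Assumption: an empty prompt is reported as ValueError.
        raise ValueError("prompt must contain at least one token")
    for token in ids:
        if isinstance(token, bool) or not isinstance(token, int):
            # Assumption: a non-integer token is reported as TypeError.
            raise TypeError(f"non-integer token id: {token!r}")
    return ids


print(normalize_stop(128009))        # an int becomes a one-element list
print(normalize_stop((13, 198)))     # a tuple becomes a list
print(normalize_stop(["\n\n"]))      # a list passes through unchanged
```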
Returns a Tinker TokensWithLogprobs:

| Field | Description |
| --- | --- |
| tokens | list[int]: the sampled token ids. |
| maybe_logprobs | list[float] \| None: per-token logprobs, or None when the response carried none. |
| stop_reason | The model’s stop reason (e.g. "stop", "length"), falling back to the response status or "unknown". |

If the Sail response is malformed (missing output_text, missing or non-integer token_ids, or mismatched token/logprob lengths), a sail.InferenceError is raised with the offending response attached as exc.response.
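To make the malformed-response conditions concrete, here is a rough sketch of that kind of check. Everything in it is illustrative: MalformedResponse is a hypothetical stand-in for sail.InferenceError, and the flat dict with token_ids and logprobs keys is an assumed shape, not the actual Sail response layout.

```python
class MalformedResponse(Exception):
    """Hypothetical stand-in for sail.InferenceError; keeps the raw response."""

    def __init__(self, message, response):
        super().__init__(message)
        self.response = response


def parse_sampled(response: dict):
    """Return (token_ids, logprobs) or raise with the offending response attached."""
    token_ids = response.get("token_ids")
    ids_ok = isinstance(token_ids, list) and all(
        isinstance(t, int) and not isinstance(t, bool) for t in token_ids
    )
    if not ids_ok:
        raise MalformedResponse("missing or non-integer token_ids", response)
    logprobs = response.get("logprobs")  # None means the response carried no logprobs
    if logprobs is not None and len(logprobs) != len(token_ids):
        raise MalformedResponse("token/logprob length mismatch", response)
    return token_ids, logprobs


try:
    parse_sampled({"token_ids": [1, 2, 3], "logprobs": [-0.1, -0.2]})
except MalformedResponse as exc:
    # The attached response is what you would log when debugging a bad rollout.
    print(f"{exc}: {exc.response}")
```

Catching the error per rollout and logging the attached response lets a training loop skip one bad sample without discarding the whole batch.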

get_tinker_checkpoint_signed_url_async

await get_tinker_checkpoint_signed_url_async(
    service_client,
    tinker_path: str,
    *,
    ttl_seconds: int | None = None,
) -> str
Resolves a Tinker checkpoint path to a signed archive URL, suitable for passing as tinker_lora_signed_url to SailTokenCompleter. It calls the Tinker service client’s REST client (get_checkpoint_archive_url_from_tinker_path_async) and unwraps the URL from the result (accepting a bare string, or url / signed_url / archive_url / checkpoint_archive_url on an object or mapping). This helper is async-only and requires a Tinker REST client with async checkpoint URL methods.
| Parameter | Description |
| --- | --- |
| service_client | A Tinker ServiceClient (exposes create_rest_client()). |
| tinker_path | The Tinker checkpoint path to resolve. |
| ttl_seconds | Optional checkpoint TTL to set or extend before resolving the signed URL. |
Raises sail.InferenceError if the Tinker client does not provide async checkpoint URL methods, or if the response does not contain a URL.
import sail

signed_url = await sail.get_tinker_checkpoint_signed_url_async(
    service_client,
    tinker_path,
    ttl_seconds=3600,
)

completer = sail.SailTokenCompleter(
    model="meta-llama/Llama-3.1-8B-Instruct",
    max_tokens=256,
    completion_window="priority",
    tinker_lora_signed_url=signed_url,
    adapter_config={"r": 16, "alpha": 32},
)
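The URL unwrapping described for get_tinker_checkpoint_signed_url_async (a bare string, or one of several attribute or key names) can be pictured roughly as follows. This is a sketch under assumptions rather than the SDK's code: unwrap_url is a hypothetical name, the lookup order simply follows the order the fields are listed on this page, and LookupError stands in for sail.InferenceError.

```python
from collections.abc import Mapping
from types import SimpleNamespace

# Field names listed in the reference; trying them in this order is an assumption.
URL_FIELDS = ("url", "signed_url", "archive_url", "checkpoint_archive_url")


def unwrap_url(result) -> str:
    """Pull a URL out of a bare string, a mapping, or an attribute-bearing object."""
    if isinstance(result, str) and result:
        return result
    for field in URL_FIELDS:
        if isinstance(result, Mapping):
            value = result.get(field)
        else:
            value = getattr(result, field, None)
        if isinstance(value, str) and value:
            return value
    # Stand-in for sail.InferenceError when no URL is present.
    raise LookupError("response did not contain a URL")


print(unwrap_url("https://example.invalid/a.tar"))
print(unwrap_url({"signed_url": "https://example.invalid/b.tar"}))
print(unwrap_url(SimpleNamespace(archive_url="https://example.invalid/c.tar")))
```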