sail.SailTokenCompleter is a drop-in tinker-cookbook TokenCompleter that samples from your Tinker checkpoints on Sail, with no manual adapter upload step.
Token IDs go in and token IDs come out. The completer sends your prompt token IDs to Sail verbatim and returns sampled token IDs with per-token logprobs, so there is no chat-template or re-tokenization drift between training and sampling. Each call creates a background Responses request on the completion window you choose (priority by default) and polls it to completion, retrying transient failures with exponential backoff.
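The poll-and-retry behavior described above can be pictured with a small stand-alone sketch. This is not Sail's implementation: `poll_until_done`, `TransientError`, the status strings, and every delay value are hypothetical names and numbers chosen for illustration, and the only thing it aims to show is polling to a terminal state while backing off exponentially on transient failures.

```python
import random
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (timeout, HTTP 5xx, ...)."""


# Assumed terminal states; the real request lifecycle may use other names.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def poll_until_done(
    fetch_status: Callable[[], dict],
    *,
    poll_interval: float = 1.0,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until it reports a terminal state.

    Consecutive transient failures are retried with exponential backoff plus
    jitter; any successful poll resets the failure counter.
    """
    failures = 0
    while True:
        try:
            response = fetch_status()
        except TransientError:
            failures += 1
            if failures > max_retries:
                raise  # give up after too many consecutive failures
            delay = min(max_delay, base_delay * 2 ** (failures - 1))
            sleep(delay + random.uniform(0, delay / 2))  # add jitter
            continue
        failures = 0
        if response.get("status") in TERMINAL_STATES:
            return response
        sleep(poll_interval)  # still running: wait before the next poll
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting; in normal use the default `time.sleep` applies.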
For a walkthrough of using sail.SailTokenCompleter in a GRPO-style Tinker training loop, see the following guide.
Install
Sample on Sail
SailTokenCompleter works anywhere tinker-cookbook expects a TokenCompleter (for example RL rollouts, evals, or direct calls):
Parameters
| Parameter | Default | Description |
|---|---|---|
| model | (required) | Sail model ID. Must support LoRA serving when a LoRA source is set; see LoRAs. |
| max_tokens | (required) | Maximum sampled tokens per call. |
| temperature | 1.0 | Sampling temperature. |
| top_p | 1.0 | Nucleus sampling threshold. |
| completion_window | "priority" | Completion window for each request. LoRA requests cannot use asap; the selected window must be supported by the model. |
| lora | None | Name or ID of a LoRA uploaded to Sail. |
| tinker_lora_signed_url | None | Signed Tinker checkpoint archive URL. Mutually exclusive with lora. |
| adapter_config | None | PEFT adapter_config.json contents (dict or JSON string). Required with tinker_lora_signed_url. |
| tinker_lora_name | None | Optional label for the Tinker checkpoint. |
| metadata | None | Extra request metadata merged into each request. |
| timeout | None | Per-HTTP-call timeout in seconds. |
| request_logprobs | True | Request per-token logprobs with each sample. |
The stop argument on the call itself accepts a string, a list of strings, or token IDs, matching the tinker-cookbook TokenCompleter contract.
Sample from a Tinker checkpoint
To sample from a LoRA you are training in Tinker, save sampler weights, resolve a signed archive URL, and pass both the URL and the adapter’s PEFT config to the completer. Sail downloads the checkpoint archive and loads the adapter for your requests.
adapter_config is the PEFT adapter config for the LoRA Tinker is training. The same compatibility rules apply as for uploaded LoRAs: the base model must match model, and the rank must be within the base model’s limit.
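Since adapter_config may be given as a dict or a JSON string and its rank must fit the base model's limit, a local pre-check along the following lines can catch obvious mistakes before a request is sent. This is only a sketch: `load_adapter_config` and `max_rank` are hypothetical, the actual rank limit is not stated here and must come from the LoRA documentation for your model, and the only outside assumption is that PEFT stores the LoRA rank under the `r` key of adapter_config.json.

```python
import json
from typing import Union


def load_adapter_config(adapter_config: Union[dict, str], *, max_rank: int) -> dict:
    """Return adapter_config as a dict after a basic LoRA rank check.

    max_rank is supplied by the caller (hypothetical parameter); this helper
    does not know any model's real limit.
    """
    if isinstance(adapter_config, str):
        config = json.loads(adapter_config)  # JSON string form
    else:
        config = dict(adapter_config)  # copy so the caller's dict is untouched

    rank = config.get("r")  # PEFT's LoRA rank field
    if not isinstance(rank, int) or isinstance(rank, bool) or rank <= 0:
        raise ValueError("adapter_config needs a positive integer rank under 'r'")
    if rank > max_rank:
        raise ValueError(f"LoRA rank {rank} exceeds the allowed maximum {max_rank}")
    return config
```

The base-model match is not checked here, because how a Sail model ID relates to the base model named in the PEFT config is outside what this page specifies.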
When ttl_seconds is passed to get_tinker_checkpoint_signed_url_async, the helper sets the Tinker checkpoint’s TTL before resolving the URL, so per-step RL sampler checkpoints are cleaned up automatically instead of accumulating in your Tinker account.
Using an uploaded LoRA instead
If you have already uploaded a LoRA to Sail, pass its name or ID as lora instead of a signed URL:
Constraints
- Tinker checkpoints only apply through SailTokenCompleter. The adapter is loaded on Sail’s raw-token sampling path. A plain text Responses or Chat Completions request that happens to carry Tinker checkpoint metadata is served by the base model. Sample from Tinker checkpoints only via SailTokenCompleter.
- lora and tinker_lora_signed_url are mutually exclusive. Pass one LoRA source per completer.
- adapter_config is required with tinker_lora_signed_url. Sail needs the PEFT config to load the checkpoint weights.
- model must support LoRA serving when a LoRA source is set (see supported base models).
- LoRA requests cannot use the asap completion window. Set completion_window to priority (the default), standard, or flex when the selected model supports that window (see Completion Windows).
- Signed checkpoint URLs expire. Resolve a fresh URL for each new checkpoint, and re-resolve if a long-running loop reuses an old one.
- tinker-cookbook must be installed. Constructing a SailTokenCompleter without it raises an error; the rest of the sail SDK works without Tinker packages.
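The argument-level constraints above can be mirrored in a small pre-flight check. The sketch below is not part of the sail SDK: `check_lora_arguments` is a hypothetical helper that reuses the documented parameter names and covers only the rules that can be verified locally (one LoRA source, adapter_config alongside a signed URL, no asap window for LoRA requests). Whether a given model supports LoRA serving or a particular window still has to be checked against the model documentation.

```python
from typing import Optional, Union


def check_lora_arguments(
    *,
    completion_window: str = "priority",
    lora: Optional[str] = None,
    tinker_lora_signed_url: Optional[str] = None,
    adapter_config: Optional[Union[dict, str]] = None,
) -> None:
    """Raise ValueError if the arguments break a documented LoRA constraint."""
    # Only one LoRA source may be set per completer.
    if lora is not None and tinker_lora_signed_url is not None:
        raise ValueError("lora and tinker_lora_signed_url are mutually exclusive")

    # A signed Tinker checkpoint URL needs the PEFT config to go with it.
    if tinker_lora_signed_url is not None and adapter_config is None:
        raise ValueError("adapter_config is required with tinker_lora_signed_url")

    # LoRA requests cannot run on the asap completion window.
    uses_lora = lora is not None or tinker_lora_signed_url is not None
    if uses_lora and completion_window == "asap":
        raise ValueError("LoRA requests cannot use the asap completion window")
```

Running a check like this before constructing the completer surfaces configuration mistakes early, before any request is sent.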