Completion windows
  • Standard: Average turn time of ~5 minutes. Appropriate for standard long-running agent workloads.
  • Priority: Average turn time of ~1 minute. Appropriate for latency-sensitive agent loops.
  • Flex: Best-effort scheduling. Appropriate for batch processing, evals, and offline workloads.
  • ASAP: Immediate response on the fastest hardware. Appropriate for interactive UIs and human-in-the-loop.
Prices in USD per 1M tokens

Model           Model ID                   Window    Input   Cached  Output
Kimi K2.5       moonshotai/Kimi-K2.5       Standard  0.20    0.10    1.20
                                           Priority  0.25    0.15    1.80
                                           Flex      0.16    0.05    0.80
                                           ASAP      0.60    0.10    3.00
GLM-5.1 (FP8)   zai-org/GLM-5.1-FP8        Standard  0.50    0.12    2.50
                                           Priority  0.70    0.18    3.00
                                           Flex      0.40    0.08    1.80
                                           ASAP      1.40    0.26    4.40
DeepSeek V3.2   deepseek-ai/DeepSeek-V3.2  Flex      0.04    0.01    0.25
                                           ASAP      0.56    0.28    1.68
gpt-oss-20b     openai/gpt-oss-20b         Flex      0.005   0.001   0.02
                                           ASAP      0.06    0.03    0.30
gpt-oss-120b    openai/gpt-oss-120b        Flex      0.007   0.001   0.025
                                           ASAP      0.08    0.04    0.40
MiniMax M2.7    MiniMaxAI/MiniMax-M2.7     Flex      0.06    0.015   0.30
                                           ASAP      0.30    0.06    1.20
Gemma 4 31B IT  google/gemma-4-31B-it      Flex      0.06    0.02    0.30
                                           ASAP      0.14    0.07    0.40
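As a worked example of reading the per-1M-token prices, the sketch below estimates the cost of one request against the Kimi K2.5 rates listed above. It assumes that cached input tokens are billed at the Cached rate and the remaining input tokens at the Input rate; this page does not spell out that split, so treat it as an assumption. The names `Rate`, `KIMI_K2_5`, and `estimate_cost_usd` are hypothetical and not part of any Sail SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """USD per 1M tokens for one model in one completion window."""
    input: float
    cached: float
    output: float


# Kimi K2.5 prices copied from the table above.
KIMI_K2_5 = {
    "standard": Rate(input=0.20, cached=0.10, output=1.20),
    "priority": Rate(input=0.25, cached=0.15, output=1.80),
    "flex": Rate(input=0.16, cached=0.05, output=0.80),
    "asap": Rate(input=0.60, cached=0.10, output=3.00),
}


def estimate_cost_usd(rate: Rate, input_tokens: int,
                      cached_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD.

    Assumption: cached_tokens is the portion of input_tokens served from
    the prompt cache and is billed at the cached rate instead of the
    input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * rate.input
             + cached_tokens * rate.cached
             + output_tokens * rate.output)
    return total / 1_000_000


# 2M input tokens (500k of them cached) and 100k output tokens, standard window:
# 1.5M * 0.20 + 0.5M * 0.10 + 0.1M * 1.20, per 1M tokens, is about 0.47 USD.
cost = estimate_cost_usd(KIMI_K2_5["standard"], 2_000_000, 500_000, 100_000)
print(f"{cost:.2f}")
```

Running the same token counts through `KIMI_K2_5["flex"]` gives a quick way to compare windows for a given workload.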
  • Sail supports four completion windows: standard, priority, flex, and asap. The table above lists prices for each window a model supports. See Completion Windows for details.
  • Prompt caching is implicit, based on prefix matching. Optionally, you may use prompt_cache_key as a routing hint to help maximize cache hit rates.
  • Requests for windows a model doesn’t support (rows omitted in the table above) automatically route to flex for scheduling and billing.
  • See Models for capabilities and other details on supported models.
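
The flex fallback described in the notes above can be mirrored client-side, for example to predict which window a request will be billed under before sending it. This is a minimal sketch, not Sail's routing implementation: the supported-window sets are transcribed from the pricing table on this page, and `SUPPORTED_WINDOWS` and `effective_window` are hypothetical names.

```python
WINDOWS = ("standard", "priority", "flex", "asap")

# Windows each model has a price row for in the table on this page.
SUPPORTED_WINDOWS = {
    "moonshotai/Kimi-K2.5": set(WINDOWS),
    "zai-org/GLM-5.1-FP8": set(WINDOWS),
    "deepseek-ai/DeepSeek-V3.2": {"flex", "asap"},
    "openai/gpt-oss-20b": {"flex", "asap"},
    "openai/gpt-oss-120b": {"flex", "asap"},
    "MiniMaxAI/MiniMax-M2.7": {"flex", "asap"},
    "google/gemma-4-31B-it": {"flex", "asap"},
}


def effective_window(model_id: str, requested: str) -> str:
    """Return the window a request is expected to be scheduled and billed under.

    A requested window the model does not support falls back to "flex",
    following the routing note above. Unknown model IDs raise KeyError.
    """
    if requested not in WINDOWS:
        raise ValueError(f"unknown completion window: {requested!r}")
    supported = SUPPORTED_WINDOWS[model_id]
    return requested if requested in supported else "flex"


print(effective_window("moonshotai/Kimi-K2.5", "priority"))       # priority
print(effective_window("deepseek-ai/DeepSeek-V3.2", "priority"))  # flex
```

Because the table may change, a hard-coded mapping like this would need to be kept in sync with the published prices.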