  • Sail is currently best optimized for the 15m completion window. More windows, such as 3m and 1h, will be added soon.
  • You must set background=True for all 15m and 24h requests. Requests with background=False, or a missing background field, will be treated as asap requests.
  • The asap window is provided mostly for convenience. Sail is at parity with other providers, but doesn’t specialize in this tier.
  • Prompt caching is implicit, based on prefix matching.
  • Each model row links the exact Hugging Face checkpoint Sail currently serves. If we offer multiple quantizations, we list them as separate model IDs.
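The background rule in the list above can be sketched as a small helper. This is only an illustration of the stated rule: `effective_window` and its `requested_window` argument are hypothetical names, not part of Sail's API, and how a window is actually selected in a request is not covered here.

```python
from typing import Optional

# Windows named on this page. "asap" is the tier requests fall back to.
BACKGROUND_WINDOWS = {"15m", "24h"}
KNOWN_WINDOWS = BACKGROUND_WINDOWS | {"asap"}


def effective_window(requested_window: str, background: Optional[bool] = None) -> str:
    """Return the window a request would be treated as under the rule above.

    Hypothetical helper: a 15m or 24h request only keeps its window when
    background is explicitly True; False or a missing field means asap.
    """
    if requested_window not in KNOWN_WINDOWS:
        raise ValueError(f"unknown window: {requested_window!r}")
    if requested_window in BACKGROUND_WINDOWS and background is True:
        return requested_window
    return "asap"
```

For instance, `effective_window("24h", background=False)` and `effective_window("15m")` both evaluate to `"asap"`, while `effective_window("15m", background=True)` keeps `"15m"`.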
All prices are listed in USD per 1M tokens (MTok).
Model ID                  | 15 min                 | 24 hour                | ASAP
                          | In      Cached  Out    | In      Cached  Out    | In      Cached  Out
--------------------------+------------------------+------------------------+----------------------
moonshotai/Kimi-K2.5      | $0.10   $0.03   $0.80  | $0.05   $0.015  $0.40  | $0.60   $0.10   $3.00
zai-org/GLM-4.7           | $0.12   $0.05   $0.60  | $0.06   $0.025  $0.30  | $0.60   $0.30   $2.20
zai-org/GLM-5             | $0.16   $0.06   $0.85  | $0.08   $0.03   $0.40  | $1.00   $0.20   $3.20
zai-org/GLM-5.1-FP8       | $0.25   $0.12   $2.00  | $0.125  $0.06   $1.00  | $1.40   $0.26   $4.40
deepseek-ai/DeepSeek-V3.2 | $0.08   $0.02   $0.50  | $0.04   $0.01   $0.25  | $0.56   $0.28   $1.68
openai/gpt-oss-20b        | $0.01   $0.005  $0.05  | $0.005  $0.001  $0.02  | $0.06   $0.03   $0.30
openai/gpt-oss-120b       | $0.015  $0.006  $0.07  | $0.007  $0.001  $0.025 | $0.08   $0.04   $0.40
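As a worked example of per-MTok pricing, the sketch below estimates the cost of a single request from token counts. It assumes that cached input tokens are billed at the Cached rate in place of the In rate, which this page does not state explicitly. The `PRICES` table and `estimate_cost_usd` are hypothetical names, and only a few rows are copied from the price list.

```python
from decimal import Decimal

# USD per 1M tokens as (in, cached, out), copied from the price list above.
PRICES = {
    ("deepseek-ai/DeepSeek-V3.2", "15m"): ("0.08", "0.02", "0.50"),
    ("deepseek-ai/DeepSeek-V3.2", "asap"): ("0.56", "0.28", "1.68"),
    ("openai/gpt-oss-120b", "24h"): ("0.007", "0.001", "0.025"),
}

MTOK = Decimal(1_000_000)


def estimate_cost_usd(model: str, window: str,
                      uncached_in: int, cached_in: int, out: int) -> Decimal:
    """Estimate request cost in USD.

    Assumption (not confirmed by the source): cached input tokens are
    charged at the Cached rate instead of the In rate.
    """
    p_in, p_cached, p_out = (Decimal(p) for p in PRICES[(model, window)])
    return (uncached_in * p_in + cached_in * p_cached + out * p_out) / MTOK
```

Under that assumption, 2M uncached input tokens, 1M cached input tokens, and 500k output tokens on deepseek-ai/DeepSeek-V3.2 come to $0.43 in the 15m window and $2.24 in the asap window.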
Use GET /v1/models in your target environment to confirm runtime availability for your API key.
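A minimal sketch of that availability check follows, using only the Python standard library. Several details are assumptions rather than documented behavior: the base URL is supplied by the caller, the API key is sent as a Bearer token, and the response is assumed to be a JSON object with a `data` list of entries that each carry an `id`. All function names are hypothetical.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request (Bearer auth is an assumption)."""
    url = base_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def model_ids(payload: dict) -> set:
    """Collect model IDs, assuming a {"data": [{"id": ...}, ...]} response shape."""
    return {entry["id"] for entry in payload.get("data", [])}


def is_model_available(base_url: str, api_key: str, model_id: str) -> bool:
    """Return True if model_id is listed for this API key."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_id in model_ids(json.load(response))
```

Calling `is_model_available(base_url, api_key, "zai-org/GLM-5")` before submitting a batch gives an early signal if a listed model is not enabled for the key in use.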