- Sail is most optimized for the `15m` completion window today. More windows like `3m` and `1h` will be added soon.
- You must set `background=True` for all `15m` and `24h` requests. Requests with `background=False`, or a missing `background` field, will be treated as `asap` requests.
- The `asap` window is provided mostly for convenience. Sail is at parity with other providers, but doesn't specialize in this tier.
- Prompt caching is implicit, based on prefix matching.
- Each model row links the exact Hugging Face checkpoint Sail currently serves. If we offer multiple quantizations, we list them as separate model IDs.
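
The `background` rule above can be sketched as a small helper. This is only an illustration of the stated behavior, not Sail's implementation: `completion_window` is a hypothetical field name chosen for the sketch, and `effective_window` is not part of any SDK.

```python
from typing import Any, Mapping


def effective_window(request: Mapping[str, Any]) -> str:
    """Return the window a request would be served under, per the rules above."""
    # `completion_window` is a hypothetical field name used only in this sketch.
    requested = request.get("completion_window", "asap")
    # A request with background=False, or with no `background` field at all,
    # is treated as an asap request regardless of the window it asked for.
    if request.get("background") is not True:
        return "asap"
    # Other combinations (for example background=True with an unlisted window)
    # are not specified in the notes above, so the value is passed through as-is.
    return requested
```

With this helper, a `15m` request that omits `background` resolves to `asap`, which is a quick way to catch requests that would be billed at the more expensive tier by accident.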

| Model ID | 15 min In | 15 min Cached | 15 min Out | 24 hour In | 24 hour Cached | 24 hour Out | ASAP In | ASAP Cached | ASAP Out |
|---|---|---|---|---|---|---|---|---|---|
| moonshotai/Kimi-K2.5 | $0.10 | $0.03 | $0.80 | $0.05 | $0.015 | $0.40 | $0.60 | $0.10 | $3.00 |
| zai-org/GLM-4.7 | $0.12 | $0.05 | $0.60 | $0.06 | $0.025 | $0.30 | $0.60 | $0.30 | $2.20 |
| zai-org/GLM-5 | $0.16 | $0.06 | $0.85 | $0.08 | $0.03 | $0.40 | $1.00 | $0.20 | $3.20 |
| zai-org/GLM-5.1-FP8 | $0.25 | $0.12 | $2.00 | $0.125 | $0.06 | $1.00 | $1.40 | $0.26 | $4.40 |
| deepseek-ai/DeepSeek-V3.2 | $0.08 | $0.02 | $0.50 | $0.04 | $0.01 | $0.25 | $0.56 | $0.28 | $1.68 |
| openai/gpt-oss-20b | $0.01 | $0.005 | $0.05 | $0.005 | $0.001 | $0.02 | $0.06 | $0.03 | $0.30 |
| openai/gpt-oss-120b | $0.015 | $0.006 | $0.07 | $0.007 | $0.001 | $0.025 | $0.08 | $0.04 | $0.40 |

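
To compare windows for a given workload, the prices above can be plugged into a small estimator. This is a sketch under two assumptions that the table does not state: that each price is per 1M tokens, and that cached tokens are billed at the cached rate instead of the input rate. `estimate_cost` and `PRICES` are illustrative names, and only two models are copied in.

```python
# USD prices copied from the table above, as (input, cached input, output).
PRICES = {
    "deepseek-ai/DeepSeek-V3.2": {
        "15m": (0.08, 0.02, 0.50),
        "24h": (0.04, 0.01, 0.25),
        "asap": (0.56, 0.28, 1.68),
    },
    "openai/gpt-oss-120b": {
        "15m": (0.015, 0.006, 0.07),
        "24h": (0.007, 0.001, 0.025),
        "asap": (0.08, 0.04, 0.40),
    },
}

# Assumption: prices are quoted per 1M tokens (the unit is not given in the table).
TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(model: str, window: str, input_tokens: int,
                  cached_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload for one model and window."""
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    in_price, cached_price, out_price = PRICES[model][window]
    # Assumption: cached tokens are charged at the cached rate, the rest at the input rate.
    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * in_price
             + cached_tokens * cached_price
             + output_tokens * out_price)
    return total / TOKENS_PER_PRICE_UNIT
```

Under those assumptions, 2,000,000 input tokens (500,000 of them cached) and 100,000 output tokens on `deepseek-ai/DeepSeek-V3.2` come to about $0.18 in the `15m` window, $0.09 in `24h`, and $1.148 in `asap`.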
Use `GET /v1/models` in your target environment to confirm runtime availability for your API key.
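
A minimal sketch of that availability check, using only the Python standard library. Several details here are assumptions not confirmed above: `BASE_URL` is a placeholder host, the API key is assumed to be sent as a Bearer token, and the response is assumed to have the shape `{"data": [{"id": ...}]}`. Adjust each to match the real API.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the real API host for your environment.
BASE_URL = "https://api.example.com"


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /v1/models request without sending it."""
    # Assumption: the API accepts the key as a Bearer token.
    return urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> set[str]:
    """Extract model IDs from a decoded /v1/models response."""
    # Assumption: the response looks like {"data": [{"id": "..."}, ...]}.
    return {entry["id"] for entry in payload.get("data", [])}


def is_available(model_id: str, api_key: str) -> bool:
    """Return True if `model_id` is listed for this API key."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=30) as resp:
        return model_id in model_ids(json.load(resp))
```

Calling `is_available("zai-org/GLM-5", api_key)` before submitting a batch is one way to fail fast when a model is not enabled for the key in use.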