Sail provides OpenAI- and Anthropic-compatible inference APIs. You can use the OpenAI SDK, Anthropic SDK, or raw HTTP clients.
Alpha — Chat Completions and Messages API support is still in active development. Behavior and supported fields may change without notice. We recommend the Responses API for production workloads.

Authentication

Include your API key as a Bearer token in the Authorization header of every request:
Authorization: Bearer YOUR_SAIL_API_KEY
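
As a sketch using Python's standard library (the base URL shown is a placeholder assumption, not a documented Sail endpoint):

```python
import os
import urllib.request

# Placeholder assumption -- substitute your actual Sail base URL.
BASE_URL = "https://api.sail.example"

def authed_request(path: str) -> urllib.request.Request:
    """Build a request that carries the API key as a Bearer token."""
    api_key = os.environ.get("SAIL_API_KEY", "YOUR_SAIL_API_KEY")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = authed_request("/v1/models")
```

Passing `req` to `urllib.request.urlopen` would then issue the GET; the same header shape applies to every endpoint below.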

Supported endpoints

  • Models: GET /v1/models
  • OpenAI Responses: POST /v1/responses
  • OpenAI Responses: GET /v1/responses/{response_id}
  • OpenAI Chat Completions: POST /v1/chat/completions
  • Anthropic Messages: POST /v1/messages
Not currently supported:
  • DELETE /v1/responses/{response_id}
  • POST /v1/responses/{response_id}/cancel
  • POST /v1/messages/batches and related batch endpoints

Shared request metadata

All three inference APIs accept a metadata object of string key/value pairs. The following keys are especially useful:
  • completion_window: "asap", "15m", or "24h"
  • completion_webhook: webhook URL for completion notification
  • webhook_token: optional bearer token that Sail includes when calling your webhook
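
For example, a metadata object requesting a 15-minute completion window with a webhook callback might look like this sketch (the webhook URL and token are illustrative assumptions):

```python
# All metadata values must be strings. The URL and token here are
# illustrative assumptions, not documented values.
metadata = {
    "completion_window": "15m",  # one of "asap", "15m", or "24h"
    "completion_webhook": "https://example.com/hooks/sail-done",
    "webhook_token": "shared-secret-123",
}

# Sanity checks mirroring the constraints above.
assert all(isinstance(v, str) for v in metadata.values())
assert metadata["completion_window"] in {"asap", "15m", "24h"}
```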

Responses API

Supported endpoints:
  • POST /v1/responses
  • GET /v1/responses/{response_id}
Supported fields include:
  • model, input
  • max_output_tokens, temperature, top_p
  • text.format with:
    • {"type":"text"}
    • {"type":"json_schema", "name": "...", "schema": {...}}
  • reasoning.effort (low, medium, high)
  • reasoning.generate_summary (auto, concise, detailed)
  • background
  • metadata
  • user
Current limitations:
  • Text-only input (no image/audio/file input blocks)
  • stream=true is not supported
  • Tools are not supported (tools, parallel_tool_calls, custom tool_choice)
  • store=false is not supported
  • truncation only supports "disabled"
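
Putting the supported fields together, a minimal POST /v1/responses body might look like the following sketch (the model name and JSON schema are illustrative assumptions; stream, tools, and store are deliberately omitted per the limitations above):

```python
import json

payload = {
    "model": "example-model",  # assumption: substitute a model from GET /v1/models
    "input": "Summarize the release notes in one sentence.",
    "max_output_tokens": 256,
    "temperature": 0.2,
    "text": {
        "format": {
            "type": "json_schema",
            "name": "summary",
            # Illustrative schema, not a documented shape.
            "schema": {
                "type": "object",
                "properties": {"summary": {"type": "string"}},
                "required": ["summary"],
            },
        }
    },
    "reasoning": {"effort": "low", "generate_summary": "auto"},
    "background": True,
    "metadata": {"completion_window": "asap"},
    "user": "user-123",
}

body = json.dumps(payload).encode()  # POST this body to /v1/responses
```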

Chat Completions API

Supported endpoint:
  • POST /v1/chat/completions
Supported fields include:
  • model, messages
  • temperature, top_p
  • max_completion_tokens
  • response_format with text or json_schema
  • reasoning_effort
  • metadata
  • user
Current limitations:
  • Text-only message content (no image/audio content parts)
  • stream=true is not supported
  • Tools/function-calling fields are not supported
  • n must be 1 when provided
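
A minimal POST /v1/chat/completions body built from the supported fields might look like this sketch (the model name is an illustrative assumption):

```python
import json

payload = {
    "model": "example-model",  # assumption: substitute a model from GET /v1/models
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "List three uses for a paperclip."},
    ],
    "max_completion_tokens": 128,
    "temperature": 0.7,
    "top_p": 1.0,
    "response_format": {"type": "text"},
    "reasoning_effort": "medium",
    "metadata": {"completion_window": "asap"},
    "user": "user-123",
}
# stream and the tools/function-calling fields are omitted (unsupported),
# and n is left out rather than set to anything other than 1.
body = json.dumps(payload).encode()  # POST this body to /v1/chat/completions
```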

Messages API

Supported endpoint:
  • POST /v1/messages
Supported fields include:
  • model, max_tokens, messages
  • temperature, top_p
  • output_config.format with type: "json_schema"
  • metadata
Current limitations:
  • Text-only content blocks
  • stream=true is not supported
  • system, tools, tool_choice, top_k, thinking, and service_tier are not supported
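
A minimal POST /v1/messages body within these limits might look like the following sketch (the model name is an illustrative assumption, and the exact envelope inside output_config.format beyond type: "json_schema" is assumed, not documented):

```python
import json

payload = {
    "model": "example-model",  # assumption: substitute a model from GET /v1/models
    "max_tokens": 512,
    "messages": [
        # Text-only content blocks; system, tools, and thinking are unsupported
        # and therefore omitted.
        {"role": "user", "content": [{"type": "text", "text": "Name one prime number."}]},
    ],
    "temperature": 0.5,
    "output_config": {
        "format": {
            "type": "json_schema",
            # Assumed envelope: an illustrative schema under a "schema" key.
            "schema": {
                "type": "object",
                "properties": {"answer": {"type": "integer"}},
                "required": ["answer"],
            },
        }
    },
    "metadata": {"completion_window": "24h"},
}
body = json.dumps(payload).encode()  # POST this body to /v1/messages
```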
Explore the full endpoint-level request and response schemas in the API reference tab.