Sail provides OpenAI- and Anthropic-compatible inference APIs. You can use the OpenAI SDK, Anthropic SDK, or raw HTTP clients.
Alpha — Chat Completions and Messages API support is still in active development. Behavior and supported fields may change without notice. We recommend the Responses API for production workloads.

Authentication

Include your API key as a Bearer token in the Authorization header of every request:
Authorization: Bearer YOUR_SAIL_API_KEY
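
As a sketch using Python's standard library (the base URL shown is a placeholder assumption, not a documented Sail endpoint):

```python
import os
import urllib.request

# Placeholder assumption -- substitute your actual Sail base URL.
BASE_URL = "https://api.sail.example"

def authed_request(path: str) -> urllib.request.Request:
    """Build a request that carries the API key as a Bearer token."""
    api_key = os.environ.get("SAIL_API_KEY", "YOUR_SAIL_API_KEY")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = authed_request("/v1/models")
```

Passing `req` to `urllib.request.urlopen` would then issue the GET; the same header shape applies to every endpoint below.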

Supported endpoints

  • Models: GET /v1/models
  • OpenAI Responses: POST /v1/responses
  • OpenAI Responses: GET /v1/responses/{response_id}
  • OpenAI Chat Completions: POST /v1/chat/completions
  • Anthropic Messages: POST /v1/messages
Not currently supported:
  • DELETE /v1/responses/{response_id}
  • POST /v1/responses/{response_id}/cancel
  • POST /v1/messages/batches and related batch endpoints

Shared request metadata

All three inference APIs accept a metadata object of string key/value pairs. The following keys are especially useful:
  • completion_window: "asap", "15m", or "24h"
  • completion_webhook: webhook URL for completion notification
  • webhook_token: optional bearer token that Sail includes when calling your webhook
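
For example, a metadata object requesting a 15-minute completion window with a webhook callback might look like this sketch (the webhook URL and token are illustrative assumptions):

```python
# All metadata values must be strings. The URL and token here are
# illustrative assumptions, not documented values.
metadata = {
    "completion_window": "15m",  # one of "asap", "15m", or "24h"
    "completion_webhook": "https://example.com/hooks/sail-done",
    "webhook_token": "shared-secret-123",
}

# Sanity checks mirroring the constraints above.
assert all(isinstance(v, str) for v in metadata.values())
assert metadata["completion_window"] in {"asap", "15m", "24h"}
```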

Responses API

Supported endpoints:
  • POST /v1/responses
  • GET /v1/responses/{response_id}
Supported fields include:
  • model, input
  • max_output_tokens, temperature, top_p
  • text.format with:
    • {"type":"text"}
    • {"type":"json_schema", "name": "...", "schema": {...}}
  • reasoning.effort (low, medium, high)
  • reasoning.generate_summary (auto, concise, detailed)
  • background
  • metadata
  • user
Current limitations:
  • Text-only input (no image/audio/file input blocks)
  • stream=true is not supported
  • Tools are not supported (tools, parallel_tool_calls, custom tool_choice)
  • store=false is not supported
  • truncation only supports "disabled"
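
Putting the supported fields together, a minimal POST /v1/responses body might look like the following sketch (the model name and JSON schema are illustrative assumptions; stream, tools, and store are deliberately omitted per the limitations above):

```python
import json

payload = {
    "model": "example-model",  # assumption: substitute a model from GET /v1/models
    "input": "Summarize the release notes in one sentence.",
    "max_output_tokens": 256,
    "temperature": 0.2,
    "text": {
        "format": {
            "type": "json_schema",
            "name": "summary",
            # Illustrative schema, not a documented shape.
            "schema": {
                "type": "object",
                "properties": {"summary": {"type": "string"}},
                "required": ["summary"],
            },
        }
    },
    "reasoning": {"effort": "low", "generate_summary": "auto"},
    "background": True,
    "metadata": {"completion_window": "asap"},
    "user": "user-123",
}

body = json.dumps(payload).encode()  # POST this body to /v1/responses
```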

Chat Completions API

Supported endpoint:
  • POST /v1/chat/completions
Supported fields include:
  • model, messages
  • temperature, top_p
  • max_completion_tokens
  • response_format with text or json_schema
  • reasoning_effort
  • metadata
  • user
Current limitations:
  • Text-only message content (no image/audio content parts)
  • stream=true is not supported
  • Tools/function-calling fields are not supported
  • n must be 1 when provided
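
A minimal POST /v1/chat/completions body built from the supported fields might look like this sketch (the model name is an illustrative assumption):

```python
import json

payload = {
    "model": "example-model",  # assumption: substitute a model from GET /v1/models
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "List three uses for a paperclip."},
    ],
    "max_completion_tokens": 128,
    "temperature": 0.7,
    "top_p": 1.0,
    "response_format": {"type": "text"},
    "reasoning_effort": "medium",
    "metadata": {"completion_window": "asap"},
    "user": "user-123",
}
# stream and the tools/function-calling fields are omitted (unsupported),
# and n is left out rather than set to anything other than 1.
body = json.dumps(payload).encode()  # POST this body to /v1/chat/completions
```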

Messages API

Supported endpoint:
  • POST /v1/messages
Supported fields include:
  • model, max_tokens, messages
  • temperature, top_p
  • output_config.format with type: "json_schema"
  • metadata
Current limitations:
  • Text-only content blocks
  • stream=true is not supported
  • system, tools, tool_choice, top_k, thinking, and service_tier are not supported
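
A minimal POST /v1/messages body within these limits might look like the following sketch (the model name is an illustrative assumption, and the exact envelope inside output_config.format beyond type: "json_schema" is assumed, not documented):

```python
import json

payload = {
    "model": "example-model",  # assumption: substitute a model from GET /v1/models
    "max_tokens": 512,
    "messages": [
        # Text-only content blocks; system, tools, and thinking are unsupported
        # and therefore omitted.
        {"role": "user", "content": [{"type": "text", "text": "Name one prime number."}]},
    ],
    "temperature": 0.5,
    "output_config": {
        "format": {
            "type": "json_schema",
            # Assumed envelope: an illustrative schema under a "schema" key.
            "schema": {
                "type": "object",
                "properties": {"answer": {"type": "integer"}},
                "required": ["answer"],
            },
        }
    },
    "metadata": {"completion_window": "24h"},
}
body = json.dumps(payload).encode()  # POST this body to /v1/messages
```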
Explore the full endpoint-level request and response schemas in the API reference tab.