POST /responses
curl --request POST \
  --url https://api.sailresearch.com/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "moonshotai/Kimi-K2.5",
  "input": "Explain the key ideas behind transformer architectures."
}
'

{
  "id": "<string>",
  "object": "response",
  "created_at": 123,
  "status": "pending",
  "model": "<string>",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123,
      "reasoning_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "cached_tokens": 123,
      "reasoning_tokens": 123
    },
    "total_tokens": 123,
    "prompt_tokens": 123,
    "completion_tokens": 123
  },
  "metadata": {},
  "input": "<string>",
  "output": "<string>",
  "error": {},
  "incomplete_details": {},
  "max_output_tokens": 123,
  "reasoning": {},
  "text": {
    "format": {
      "type": "text"
    }
  },
  "store": true,
  "temperature": 1,
  "top_p": 1,
  "parallel_tool_calls": true,
  "tool_choice": "<string>",
  "tools": [
    {}
  ],
  "truncation": "<string>",
  "user": "<string>"
}
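
For readers who prefer Python to curl, the request above can be sketched with the standard library alone. This is a minimal illustration, not an official client: the helper names `build_request` and `send` are made up for this example, and only the URL, headers, and body fields shown in the curl command are taken from this page.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
API_URL = "https://api.sailresearch.com/v1/responses"


def build_request(token: str, model: str, input_text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = json.dumps({"model": model, "input": input_text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body (hypothetical helper)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `send(build_request(token, "moonshotai/Kimi-K2.5", "..."))` would perform the call; `urlopen` raises `urllib.error.HTTPError` on non-2xx statuses, which a real caller would want to handle.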


Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

Idempotency-Key
string

Makes the request retry-safe. Sail stores a reservation keyed by (organization, API key, Idempotency-Key); retrying with the same value returns the previously stored response instead of re-running inference. Keys are capped at 255 characters. See Idempotent Requests for full semantics.

Maximum string length: 255
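
One way to use the header is to generate a key once per logical request and reuse it on every retry. The sketch below assumes a caller-supplied `send` function (hypothetical) that takes the headers and performs the HTTP call; the use of a UUID as the key and the choice to retry only on `ConnectionError` are illustrative, not requirements stated on this page.

```python
import uuid
from typing import Callable, Dict, Optional

MAX_KEY_LENGTH = 255  # documented cap on Idempotency-Key


def idempotent_headers(token: str, key: str) -> Dict[str, str]:
    """Return request headers carrying the given Idempotency-Key."""
    if len(key) > MAX_KEY_LENGTH:
        raise ValueError("Idempotency-Key must be at most 255 characters")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Idempotency-Key": key,
    }


def call_with_retries(
    send: Callable[[Dict[str, str]], dict], token: str, attempts: int = 3
) -> dict:
    """Retry `send` with the SAME key so a retry cannot re-run inference."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    key = str(uuid.uuid4())  # generated once, reused for every attempt
    headers = idempotent_headers(token, key)
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return send(headers)
        except ConnectionError as exc:  # assumption: only transport errors are retried
            last_error = exc
    raise last_error
```

The key point is that the key is created outside the retry loop; generating a fresh key per attempt would defeat the purpose.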

Body

application/json

model
string
required

input
required

Text-only input. Images, audio, files, and item references are not currently supported.

Minimum string length: 1

max_output_tokens
integer | null

Required range: x >= 1

temperature
number | null

Required range: 0 <= x <= 2

top_p
number | null

Required range: 0 <= x <= 1

text
object

reasoning
object

background
boolean

prompt_cache_key
string

Optional routing hint for prompt-prefix cache locality. Requests with the same key are preferentially routed to maximize cache hit rates.
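
As an illustration, requests that share a long prompt prefix could all send the same key. The sketch below derives the key from a hash of the shared prefix; that derivation, the `prefix-` naming, and both function names are assumptions made for this example, since the page does not prescribe any key format.

```python
import hashlib


def cache_key_for_prefix(shared_prefix: str) -> str:
    """Derive a stable key from the shared prompt prefix (hypothetical convention)."""
    digest = hashlib.sha256(shared_prefix.encode("utf-8")).hexdigest()
    return f"prefix-{digest[:16]}"


def body_with_cache_key(model: str, shared_prefix: str, question: str) -> dict:
    """Build a request body whose prompt_cache_key depends only on the prefix."""
    return {
        "model": model,
        "input": shared_prefix + question,
        "prompt_cache_key": cache_key_for_prefix(shared_prefix),
    }
```

Because the key is computed from the prefix alone, two requests with different questions but the same system-style preamble carry the same `prompt_cache_key`.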

store
enum<boolean>

Only true is supported.

Available options: true

truncation
enum<string>

Available options: disabled

stream
enum<boolean>

Streaming is not currently supported.

Available options: false

user
string

Maximum string length: 256

metadata
object

Optional string-valued metadata. The completion_window key controls scheduling; the completion_webhook and webhook_token keys configure completion webhooks.
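
The documented constraints on the body can be checked client-side before sending. The following is a sketch only: `validated_body` is a hypothetical helper, it covers just the fields whose limits are stated above, and it assumes `input` is passed as a string. The accepted value formats for the metadata keys are not described on this page, so the example checks only that values are strings.

```python
from typing import Any, Dict, Optional


def validated_body(
    model: str,
    input_text: str,
    *,
    max_output_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    user: Optional[str] = None,
    metadata: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build a request body, enforcing the ranges listed in the Body section."""
    if len(input_text) < 1:
        raise ValueError("input must contain at least 1 character")
    body: Dict[str, Any] = {"model": model, "input": input_text}
    if max_output_tokens is not None:
        if max_output_tokens < 1:
            raise ValueError("max_output_tokens must be >= 1")
        body["max_output_tokens"] = max_output_tokens
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must satisfy 0 <= x <= 2")
        body["temperature"] = temperature
    if top_p is not None:
        if not 0 <= top_p <= 1:
            raise ValueError("top_p must satisfy 0 <= x <= 1")
        body["top_p"] = top_p
    if user is not None:
        if len(user) > 256:
            raise ValueError("user must be at most 256 characters")
        body["user"] = user
    if metadata is not None:
        if not all(isinstance(v, str) for v in metadata.values()):
            raise ValueError("metadata values must be strings")
        body["metadata"] = dict(metadata)
    return body
```

The fields `store`, `stream`, and `truncation` each accept a single value, so this sketch simply omits them rather than exposing them as parameters.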

Response

Response completed and returned synchronously.

id
string
required

object
enum<string>
required

Available options: response

created_at
integer
required

status
enum<string>
required

Available options: pending, running, failed, completed, cancelled
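
A client will usually want to distinguish statuses that can still change from those that cannot. The sketch below treats completed, failed, and cancelled as terminal, which is an assumption based on the names rather than something this page states. The `fetch` callable is hypothetical: it stands in for whatever mechanism returns the latest response object, which is not documented here.

```python
import time
from typing import Callable

# Assumption: these three statuses are final; pending and running are not.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})
KNOWN_STATUSES = frozenset({"pending", "running"}) | TERMINAL_STATUSES


def is_terminal(status: str) -> bool:
    """Return True if the status is assumed final; reject unknown values."""
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return status in TERMINAL_STATUSES


def wait_until_terminal(
    fetch: Callable[[], dict],
    interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` until it returns a response with a terminal status."""
    for _ in range(max_polls):
        response = fetch()
        if is_terminal(response["status"]):
            return response
        sleep(interval)
    raise TimeoutError("response did not reach a terminal status")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.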
model
string
required

usage
object
required

metadata
object
required

input

Text-only input. Images, audio, files, and item references are not currently supported.

Minimum string length: 1

output

error
object

incomplete_details
object

max_output_tokens
integer | null

reasoning
object

text
object

store
boolean

temperature
number

top_p
number

parallel_tool_calls
boolean

tool_choice

tools
object[]

truncation

user
string | null
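
To show how the response object might be consumed, the sketch below pulls a few token counts out of the usage object. The field names come from the sample response at the top of this page; the function name `summarize_usage`, the output keys, and the choice to default missing counts to 0 are assumptions made for this example.

```python
from typing import Any, Dict


def summarize_usage(response: Dict[str, Any]) -> Dict[str, int]:
    """Extract selected token counts from a response's usage object."""
    usage = response.get("usage") or {}
    input_details = usage.get("input_tokens_details") or {}
    output_details = usage.get("output_tokens_details") or {}
    return {
        "input": usage.get("input_tokens", 0),
        "cached_input": input_details.get("cached_tokens", 0),
        "output": usage.get("output_tokens", 0),
        "reasoning_output": output_details.get("reasoning_tokens", 0),
        "total": usage.get("total_tokens", 0),
    }
```

The sample response also carries prompt_tokens and completion_tokens; this sketch ignores them and makes no claim about how they relate to the other counts.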