Create a response - Sail Research

curl --request POST \
  --url https://api.sailresearch.com/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "moonshotai/Kimi-K2.5",
  "input": "Explain the key ideas behind transformer architectures."
}
'

{
  "id": "<string>",
  "created_at": 123,
  "model": "<string>",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123,
      "reasoning_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "cached_tokens": 123,
      "reasoning_tokens": 123
    },
    "total_tokens": 123,
    "prompt_tokens": 123,
    "completion_tokens": 123
  },
  "metadata": {},
  "input": "<string>",
  "output": "<string>",
  "error": {},
  "incomplete_details": {},
  "max_output_tokens": 123,
  "reasoning": {},
  "text": {
    "format": {}
  },
  "store": true,
  "temperature": 123,
  "top_p": 123,
  "parallel_tool_calls": true,
  "tool_choice": "<string>",
  "tools": [
    {}
  ],
  "truncation": "<string>",
  "user": "<string>"
}

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
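As a rough illustration of the header format above, the following Python sketch builds (but does not send) the same request as the curl example, using only the standard library. The helper name `build_request` and the `SAIL_API_TOKEN` environment variable are assumptions made for this example, not part of the API.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.sailresearch.com/v1/responses"


def build_request(token: str, model: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying the bearer token; nothing is sent here."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # SAIL_API_TOKEN is a hypothetical variable name chosen for this sketch.
    token = os.environ.get("SAIL_API_TOKEN", "<token>")
    req = build_request(
        token,
        "moonshotai/Kimi-K2.5",
        "Explain the key ideas behind transformer architectures.",
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Reading the token from the environment keeps it out of source control; any secret store would serve the same purpose.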

Headers

Idempotency-Key

string

Makes the request retry-safe. Sail stores a reservation keyed by (organization, API key, Idempotency-Key); retrying with the same value returns the previously stored response instead of re-running inference. Keys are capped at 255 characters. See Idempotent Requests for full semantics.

Maximum string length: 255
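One way to use this header is to generate a key once per logical request and reuse it on every retry. The sketch below assumes a caller-supplied `send` function that performs the HTTP call and raises `ConnectionError` on transient failures; both `send` and `send_with_retries` are hypothetical names for illustration, and using a UUID as the key is a choice of this example rather than a requirement.

```python
import uuid
from typing import Callable, Dict

MAX_KEY_LENGTH = 255  # documented cap on Idempotency-Key


def new_idempotency_key() -> str:
    """Return a fresh random key (a UUID4 string is 36 characters)."""
    key = str(uuid.uuid4())
    if len(key) > MAX_KEY_LENGTH:
        raise ValueError("Idempotency-Key exceeds 255 characters")
    return key


def send_with_retries(
    send: Callable[[Dict[str, str]], dict],
    token: str,
    attempts: int = 3,
) -> dict:
    """Call `send` up to `attempts` times, reusing one key for all attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # The same key on every attempt is what makes the retry safe.
        "Idempotency-Key": new_idempotency_key(),
    }
    last_error = None
    for _ in range(attempts):
        try:
            return send(headers)
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

Generating the key outside the retry loop is the important part: a new key per attempt would be treated as a new request.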

Body

application/json

model

string

required

input

required

Text-only input. Images, audio, files, and item references are not currently supported.

Minimum string length: 1

max_output_tokens

integer | null

Required range: x >= 1

temperature

number | null

Required range: 0 <= x <= 2

top_p

number | null

Required range: 0 <= x <= 1
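The documented constraints on these fields can be checked client-side before sending. The following is a minimal sketch, assuming `input` is passed as a plain string; `validate_body` is a hypothetical helper, and the server remains the authority on what is accepted.

```python
from typing import Any, Dict, List


def _is_number(value: Any) -> bool:
    """True for int or float, excluding bool (which is an int subclass)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_body(body: Dict[str, Any]) -> List[str]:
    """Return a list of constraint violations; an empty list means none found."""
    errors: List[str] = []

    if not isinstance(body.get("model"), str):
        errors.append("model: required string")

    text = body.get("input")
    if not isinstance(text, str) or len(text) < 1:
        errors.append("input: required, minimum string length 1")

    max_tokens = body.get("max_output_tokens")
    if max_tokens is not None:
        if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens < 1:
            errors.append("max_output_tokens: integer >= 1 or null")

    # Documented ranges: 0 <= temperature <= 2 and 0 <= top_p <= 1.
    for name, upper in (("temperature", 2), ("top_p", 1)):
        value = body.get(name)
        if value is None:
            continue
        if not _is_number(value) or not 0 <= value <= upper:
            errors.append(f"{name}: number in [0, {upper}] or null")

    return errors
```

For example, `validate_body({"model": "moonshotai/Kimi-K2.5", "input": "Hi", "temperature": 2.5})` reports a single violation for `temperature`.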

text

object

reasoning

object

background

boolean

prompt_cache_key

string

Optional routing hint for prompt-prefix cache locality. Requests with the same key are preferentially routed to maximize cache hit rates.
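Since the key is only useful when requests sharing a prompt prefix also share the key, one possible approach is to derive it from the prefix itself. The sketch below hashes the shared prefix with SHA-256; the hashing scheme, the 32-character truncation, and the helper names are assumptions of this example, as the page does not prescribe a key format.

```python
import hashlib
from typing import Any, Dict


def cache_key_for_prefix(shared_prefix: str, length: int = 32) -> str:
    """Derive a stable key from the prefix so identical prefixes map to one key."""
    digest = hashlib.sha256(shared_prefix.encode("utf-8")).hexdigest()
    return digest[:length]


def build_body(model: str, shared_prefix: str, question: str) -> Dict[str, Any]:
    """Assemble a request body whose input starts with the shared prefix."""
    return {
        "model": model,
        "input": shared_prefix + "\n\n" + question,
        "prompt_cache_key": cache_key_for_prefix(shared_prefix),
    }
```

Two requests built with the same `shared_prefix` but different questions carry the same `prompt_cache_key`, which is the situation the routing hint is described as targeting.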

store

enum<boolean>

Only true is supported.

Available options:

true

truncation

enum<string>

Available options:

disabled

stream

enum<boolean>

Streaming is not currently supported.

Available options:

false

user

string

Maximum string length: 256

metadata

object

Optional string metadata. completion_window controls scheduling; completion_webhook/webhook_token configure completion webhooks.

Response

Response completed and returned synchronously.

id

string

required

object

enum<string>

required

Available options:

response

created_at

integer

required

status

enum<string>

required

Available options:

pending, running, failed, completed, cancelled
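Client code often needs to know whether a response object can still change. The sketch below classifies the five documented status values, assuming that `failed`, `completed`, and `cancelled` are terminal while `pending` and `running` are not; that grouping is inferred from the names and is not stated on this page.

```python
from typing import Any, Dict

# Assumed grouping of the documented status values.
TERMINAL_STATUSES = frozenset({"failed", "completed", "cancelled"})
KNOWN_STATUSES = frozenset({"pending", "running"}) | TERMINAL_STATUSES


def is_terminal(response: Dict[str, Any]) -> bool:
    """Return True when the response's status is assumed to be final."""
    status = response["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return status in TERMINAL_STATUSES
```

Raising on unknown values, rather than treating them as non-terminal, avoids waiting forever if a new status is introduced.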

model

string

required

usage

object

required

metadata

object

required

input

Text-only input. Images, audio, files, and item references are not currently supported.

Minimum string length: 1

output

error

object

incomplete_details

object

max_output_tokens

integer | null

reasoning

object

text

object

store

boolean

temperature

number

top_p

number

parallel_tool_calls

boolean

tool_choice

tools

object[]

truncation

user

string | null
