When you create a request through POST /v1/responses, POST /v1/chat/completions, or POST /v1/messages, you can provide a completion webhook in request metadata. When processing finishes, Sail will POST the full response payload to your URL so you can process it without polling.

Enabling a completion webhook

Include a completion_webhook URL in the metadata object of your create request. The URL must use the http or https scheme.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SAIL_API_KEY",
    base_url="https://api.sailresearch.com/v1",
)

response = client.responses.create(
    model="moonshotai/Kimi-K2.5",
    input="Summarize this document.",
    background=True,
    metadata={
        "completion_webhook": "https://your-server.com/sail-completion",
    },
)
If metadata.completion_webhook is omitted or invalid, no webhook request is sent. The create call and the response itself are unchanged; webhooks are optional and best-effort. For Chat Completions and Anthropic Messages, pass the same keys (completion_webhook, webhook_token) in the metadata object of the request body.
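Because an invalid completion_webhook results in no webhook being sent, it can help to check the URL on the client before creating the request. The sketch below is a client-side precheck only: is_valid_webhook_url and build_webhook_metadata are hypothetical helper names, and Sail's own validation rules may differ from this check.

```python
from urllib.parse import urlparse


def is_valid_webhook_url(url):
    """Return True if url looks like an absolute http or https URL."""
    if not isinstance(url, str):
        return False
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse raises ValueError for some malformed inputs (e.g. a bad IPv6 host).
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_webhook_metadata(webhook_url, token=None):
    """Build the metadata dict, failing loudly instead of silently getting no webhook."""
    if not is_valid_webhook_url(webhook_url):
        raise ValueError(f"completion_webhook must be an http(s) URL, got {webhook_url!r}")
    metadata = {"completion_webhook": webhook_url}
    if token is not None:
        metadata["webhook_token"] = token
    return metadata
```

The returned dict can be passed as the metadata argument of the create call shown above.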

Webhook payload

Sail sends a POST request to your URL with:
  • Content-Type: application/json
  • Body: The same JSON object returned by GET /v1/responses/{response_id}
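As a rough illustration of consuming that body, the sketch below decodes the raw POST bytes and pulls out two fields. The field names id and status are assumptions based on the usual shape of a Responses API object and are not confirmed by this page; summarize_completion is a hypothetical helper name.

```python
import json


def summarize_completion(raw_body):
    """Decode a webhook body and return the (assumed) id and status fields."""
    payload = json.loads(raw_body)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object in the webhook body")
    # Assumption: the payload carries "id" and "status"; .get() returns None
    # instead of raising if either field is absent.
    return {"id": payload.get("id"), "status": payload.get("status")}
```

Using .get() keeps the handler tolerant of fields that turn out to be missing or named differently.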

Securing webhooks with a token

To verify that incoming requests are from Sail, set webhook_token in the metadata. Sail will send the value of webhook_token as a Bearer token in the Authorization header of the webhook POST.
response = client.responses.create(
    model="moonshotai/Kimi-K2.5",
    input="Summarize this document.",
    background=True,
    metadata={
        "completion_webhook": "https://your-server.com/sail-completion",
        "webhook_token": "your-secret-token",
    },
)
Your server can check Authorization: Bearer your-secret-token and reject requests that don’t match.
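A minimal sketch of that check on the receiving side, assuming the handler has access to the raw Authorization header value. The function name is_authorized and the EXPECTED_TOKEN constant are illustrative; hmac.compare_digest is used so the comparison takes constant time rather than leaking how many characters matched.

```python
import hmac

# Assumption: the same value you passed as webhook_token when creating the request.
EXPECTED_TOKEN = "your-secret-token"


def is_authorized(authorization_header, expected_token=EXPECTED_TOKEN):
    """Return True only if the header is 'Bearer <expected_token>'."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Compare as bytes: compare_digest rejects non-ASCII str arguments.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

A handler would typically reply with 401 when is_authorized returns False and skip processing the body.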

Delivery behavior

  • Retries: Sail retries a failed delivery (for example, a non-2xx status or a network error) up to 3 times. Respond with a 2xx status as soon as you have accepted the payload so that Sail stops retrying.
  • Timeout: Each attempt has a 30-second timeout. If an attempt times out or fails, Sail makes the next attempt.
  • Best-effort: Webhook failures are logged but do not affect the response or the API. The response remains available via GET /v1/responses/{response_id} even if the webhook never succeeds.
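Given the retry and timeout behavior above, a receiver may plausibly see the same payload more than once (for example, if its 2xx reply was slow or lost), so it can be worth acknowledging quickly and deduplicating. The sketch below is one way to do that under stated assumptions: WebhookInbox is a hypothetical class, and keying on an id field in the payload is an assumption about the payload shape.

```python
import queue
import threading


class WebhookInbox:
    """Accept payloads quickly; queue each response once for later processing."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()
        self.pending = queue.Queue()

    def accept(self, payload):
        """Queue payload unless its id was already seen. Returns True if queued."""
        response_id = payload.get("id")  # assumption: payload has an "id" field
        with self._lock:
            if response_id is not None:
                if response_id in self._seen:
                    return False  # duplicate delivery; still safe to reply 2xx
                self._seen.add(response_id)
        self.pending.put(payload)
        return True
```

The HTTP handler would call accept() and return 2xx in either case, while a separate worker thread drains pending, so slow processing never runs into the 30-second timeout.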

Full example

Here’s a full, end-to-end example using ngrok:
1. Start a local webhook listener that prints the payload and returns 200:
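python -c "
# Minimal webhook listener: pretty-prints each JSON POST body and returns 200.
from http.server import HTTPServer, BaseHTTPRequestHandler
import json

class H(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly Content-Length bytes, then parse and pretty-print the JSON.
        body = self.rfile.read(int(self.headers['Content-Length']))
        print(json.dumps(json.loads(body), indent=2))
        # Acknowledge with 200 so the sender does not retry.
        self.send_response(200)
        self.end_headers()

HTTPServer(('127.0.0.1', 8765), H).serve_forever()
"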
2. In a second terminal, expose it with ngrok:
ngrok http 8765
Copy the https://xxxx.ngrok-free.app forwarding URL from the output.
3. In a third terminal, create a response with the webhook:
curl -X POST https://api.sailresearch.com/v1/responses \
  -H "Authorization: Bearer YOUR_SAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshotai/Kimi-K2.5",
    "input": "What is 2+2? Reply with just the number.",
    "background": true,
    "metadata": {
      "completion_webhook": "https://xxxx.ngrok-free.app"
    }
  }'
When the response completes, Sail POSTs the full payload to your listener.