Introduction

Sail is a max-efficiency inference provider. We serve open-source models at massive scale and prioritize throughput over latency. Use Sail to build agents that tackle big tasks with minimal human involvement. Sail currently supports the OpenAI Responses API at /v1/responses. Full support for the OpenAI Chat Completions API and the Anthropic Messages API is coming soon. All APIs use Bearer token authentication with your Sail API key.
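As a rough illustration of the request shape described above, the following Python sketch builds a POST to /v1/responses with a Bearer token. The base URL `https://sail.example.com` and the helper name `build_responses_request` are placeholders invented for this example, since this page does not state the actual host, and the model name is left for the caller to supply. The `model` and `input` body fields follow the OpenAI Responses API format.

```python
import json
import urllib.request

# Hypothetical placeholder: this page does not state Sail's base URL.
BASE_URL = "https://sail.example.com"


def build_responses_request(api_key: str, model: str, prompt: str,
                            base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request to the /v1/responses endpoint."""
    # The Responses API takes the model name and the input text in a JSON body.
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/responses",
        data=body,
        method="POST",
        headers={
            # Bearer token authentication with the Sail API key.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the JSON response body. Building the request separately from sending it makes it easy to inspect the URL and headers before any network call is made.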

Quickstart

Make your first API request!
