The Usage API lets you query your Sail usage metrics from scripts, CLIs, or dashboards. It uses the same API key you use for api.sailresearch.com.

Base URL

https://usage.sailresearch.com

Authentication

All requests require a Bearer API key in the Authorization header.
This is the same API key you use for the inference API at api.sailresearch.com. Create or manage keys from the Sail dashboard.
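import os
import requests

# Read the key from the environment rather than hard-coding it.
headers = {"Authorization": f"Bearer {os.environ['SAIL_API_KEY']}"}
resp = requests.get("https://usage.sailresearch.com/v1/usage", headers=headers, timeout=10)
resp.raise_for_status()  # raise on 4xx/5xx instead of printing an error body
print(resp.json())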

GET /v1/usage

Returns a spending and balance summary for the requested time range.
| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| range | 24h, 7d, 30d, period | 30d | Time window. period = current billing period. |
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  "https://usage.sailresearch.com/v1/usage?range=7d" | jq
{
  "object": "usage.summary",
  "range": "7d",
  "period_spend": 5432,
  "burn_rate": 776,
  "balance": 10000,
  "balance_unavailable": false,
  "days_remaining": 12,
  "plan_type": "self_serve",
  "model_count": 3,
  "tokens": {
    "total": 1000000,
    "input": 600000,
    "output": 350000,
    "cached": 50000
  },
  "avg_cost_per_day": 776,
  "sla_mix": {
    "15m": 0.7,
    "1h": 0.3
  },
  "prior_period": {
    "period_spend": 4000,
    "model_count": 2,
    "tokens": {
      "total": 800000,
      "input": 500000,
      "output": 280000,
      "cached": 20000
    },
    "avg_cost_per_day": 571
  }
}
All monetary values are in cents. prior_period covers the equivalent window immediately before the requested range, for comparison.
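Because amounts arrive as integer cents, a client usually converts them for display and compares them with prior_period. The following is a minimal sketch using values from the sample response above. The helper names are hypothetical, and the dollar formatting assumes the account is billed in a currency with 100 cents per unit.

```python
from typing import Optional


def cents_to_dollars(cents: int) -> str:
    """Format an integer amount of cents as a display string (assumes dollars)."""
    return f"${cents / 100:,.2f}"


def spend_change(summary: dict) -> Optional[float]:
    """Fractional change in spend versus prior_period, or None without a baseline."""
    prior = summary.get("prior_period", {}).get("period_spend")
    if not prior:
        # Missing or zero prior spend: a ratio would be undefined.
        return None
    return (summary["period_spend"] - prior) / prior


# Values copied from the sample /v1/usage response.
summary = {
    "period_spend": 5432,
    "balance": 10000,
    "prior_period": {"period_spend": 4000},
}

print(cents_to_dollars(summary["period_spend"]))  # $54.32
print(f"{spend_change(summary):+.1%}")            # +35.8%
```

Keeping the raw integer cents until the display layer avoids accumulating floating-point error when amounts are summed.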

GET /v1/usage/breakdown

Returns time-series spend data and per-model ranking. Use this for charting.
| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| range | 24h, 7d, 30d, period, day | 30d | Time window. |
| date | YYYY-MM-DD | | Required when range=day. Drills into hourly data for that date. |
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  "https://usage.sailresearch.com/v1/usage/breakdown?range=7d" | jq
{
  "object": "usage.breakdown",
  "range": "7d",
  "granularity": "day",
  "data": [
    {
      "timestamp": "2025-01-15",
      "total": 800,
      "models": {
        "zai-org/GLM-5": {
          "total": 500,
          "tokens": 50000,
          "input_tokens": 30000,
          "output_tokens": 15000,
          "cached_tokens": 5000
        }
      },
      "slas": { "15m": 800 }
    }
  ],
  "models": [
    {
      "model": "zai-org/GLM-5",
      "total": 3500,
      "tokens": 350000,
      "input_tokens": 210000,
      "output_tokens": 105000,
      "cached_tokens": 35000,
      "slas": { "15m": 3500 },
      "percentage": 0.65
    }
  ]
}
granularity is "day" when range is 7d, 30d, or period, and "hour" when range is 24h or day (the single-date drill-down).
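The parameter rules for this endpoint can be checked on the client before a request is sent. The sketch below builds the query parameters and predicts the granularity the response should report. The function names are hypothetical, and dropping date for ranges other than day is an assumption, since the reference does not say how the server treats it.

```python
import re
from typing import Optional

ALLOWED_RANGES = {"24h", "7d", "30d", "period", "day"}


def breakdown_params(range_: str = "30d", date: Optional[str] = None) -> dict:
    """Build query parameters for GET /v1/usage/breakdown."""
    if range_ not in ALLOWED_RANGES:
        raise ValueError(f"unsupported range: {range_!r}")
    if range_ == "day":
        # date is required for the hourly drill-down.
        if date is None or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):
            raise ValueError("range=day requires date in YYYY-MM-DD form")
        return {"range": range_, "date": date}
    # Assumption: date is only meaningful with range=day, so it is not sent otherwise.
    return {"range": range_}


def expected_granularity(range_: str) -> str:
    """Granularity the response should report for a given range."""
    return "hour" if range_ in {"24h", "day"} else "day"


print(breakdown_params("day", "2025-01-15"))
print(expected_granularity("7d"))
```

The returned dict can be passed as the params argument of requests.get, which handles URL encoding.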

GET /v1/usage/activity

Returns operational metrics: request counts, token throughput, latency, and recent requests.
| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| limit | 1-100 | 10 | Number of recent requests to return. |
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  https://usage.sailresearch.com/v1/usage/activity | jq
{
  "object": "usage.activity",
  "available": true,
  "requests": {
    "last_1m": 5,
    "last_1h": 120,
    "last_24h": 2000,
    "last_7d": 14000
  },
  "tokens": {
    "last_1m": 500,
    "last_1h": 12000,
    "last_24h": 200000,
    "last_7d": 1400000,
    "token_breakdown_1h": {
      "input": 8000,
      "output": 3500,
      "cached": 500
    }
  },
  "latency": {
    "avg_1m_ms": 1200,
    "avg_1h_ms": 1500
  },
  "recent_requests": [
    {
      "response_id": "resp_abc123",
      "model": "zai-org/GLM-5",
      "sla": "15m",
      "status": "completed",
      "created_at": "2025-01-15T10:00:00Z",
      "updated_at": "2025-01-15T10:01:00Z"
    }
  ],
  "has_more": false
}
If has_more is true, there are additional recent requests beyond the requested limit.
Activity data may have a slight delay and is not a real-time stream.
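One way to use recent_requests is to list each request with the time between its two timestamps and to flag when has_more indicates a truncated list. This is a sketch over the sample payload above. Treating updated_at minus created_at as an approximate processing time for a completed request is an assumption, not something the reference states.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2025-01-15T10:00:00Z."""
    # Python versions before 3.11 reject a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def elapsed_seconds(request: dict) -> float:
    """Seconds between created_at and updated_at for one recent request."""
    delta = parse_ts(request["updated_at"]) - parse_ts(request["created_at"])
    return delta.total_seconds()


# Trimmed copy of the sample /v1/usage/activity response.
activity = {
    "recent_requests": [
        {
            "response_id": "resp_abc123",
            "status": "completed",
            "created_at": "2025-01-15T10:00:00Z",
            "updated_at": "2025-01-15T10:01:00Z",
        }
    ],
    "has_more": False,
}

for req in activity["recent_requests"]:
    print(req["response_id"], req["status"], elapsed_seconds(req))

if activity["has_more"]:
    print("List truncated; request a larger limit to see more.")
```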

GET /v1/usage/activity/timeseries

Returns bucketed time-series data for requests or tokens. Use this for throughput charts.
| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| type | requests, tokens | requests | What to chart. |
| range | 1h, 6h, 24h | 24h | Time window. Bucket size is 1 minute for 1h, 5 minutes for 6h, and 1 hour for 24h. |
Requests by model:
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  "https://usage.sailresearch.com/v1/usage/activity/timeseries?type=requests&range=1h" | jq
{
  "object": "usage.activity.timeseries",
  "type": "requests",
  "range": "1h",
  "available": true,
  "series": [
    {
      "time_bucket": "2025-01-15T10:00:00Z",
      "model": "zai-org/GLM-5",
      "count": 5
    },
    {
      "time_bucket": "2025-01-15T10:01:00Z",
      "model": "zai-org/GLM-5",
      "count": 3
    }
  ]
}
Token breakdown:
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  "https://usage.sailresearch.com/v1/usage/activity/timeseries?type=tokens&range=24h" | jq
{
  "object": "usage.activity.timeseries",
  "type": "tokens",
  "range": "24h",
  "available": true,
  "series": [
    {
      "time_bucket": "2025-01-15T10:00:00Z",
      "total_tokens": 1000,
      "input_tokens": 600,
      "output_tokens": 350,
      "cached_tokens": 50
    }
  ]
}
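The two series shapes differ: type=requests yields one point per bucket and model, while type=tokens yields one point per bucket with token counts. The sketch below aggregates each shape, using the sample responses above as input. The function names are hypothetical.

```python
from collections import defaultdict


def requests_per_model(payload: dict) -> dict:
    """Sum request counts per model across all buckets (type=requests)."""
    if payload.get("type") != "requests":
        raise ValueError("expected a type=requests timeseries payload")
    totals = defaultdict(int)
    for point in payload["series"]:
        totals[point["model"]] += point["count"]
    return dict(totals)


def cached_share(payload: dict) -> float:
    """Fraction of all tokens in the window that were cached (type=tokens)."""
    if payload.get("type") != "tokens":
        raise ValueError("expected a type=tokens timeseries payload")
    total = sum(p["total_tokens"] for p in payload["series"])
    cached = sum(p["cached_tokens"] for p in payload["series"])
    return cached / total if total else 0.0


requests_payload = {
    "type": "requests",
    "series": [
        {"time_bucket": "2025-01-15T10:00:00Z", "model": "zai-org/GLM-5", "count": 5},
        {"time_bucket": "2025-01-15T10:01:00Z", "model": "zai-org/GLM-5", "count": 3},
    ],
}
tokens_payload = {
    "type": "tokens",
    "series": [
        {"time_bucket": "2025-01-15T10:00:00Z", "total_tokens": 1000,
         "input_tokens": 600, "output_tokens": 350, "cached_tokens": 50},
    ],
}

print(requests_per_model(requests_payload))  # {'zai-org/GLM-5': 8}
print(cached_share(tokens_payload))          # 0.05
```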

Errors

All errors follow the same format used by the inference API:
{
  "error": {
    "message": "Missing or invalid Authorization header",
    "type": "authentication_error",
    "param": null,
    "code": null
  }
}
| HTTP Status | type | When |
| --- | --- | --- |
| 400 | invalid_request_error | Bad query parameter, missing required param |
| 401 | authentication_error | Missing, invalid, or expired API key |
| 402 | billing_error | Key disabled due to insufficient credits |
| 429 | rate_limit_error | Rate limit exceeded (includes Retry-After header) |
| 500 | api_error | Internal server error |
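A client can turn the error envelope and status code into a single exception type. The sketch below is one possible shape: SailUsageError and parse_error are hypothetical names, and it assumes Retry-After carries a whole number of seconds (the header may also be an HTTP date, which this sketch ignores).

```python
from typing import Mapping, Optional


class SailUsageError(Exception):
    """Hypothetical exception wrapping a Usage API error response."""

    def __init__(self, status: int, error_type: Optional[str], message: str,
                 retry_after: Optional[int] = None,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.retry_after = retry_after
        self.request_id = request_id


def parse_error(status: int, body: dict, headers: Mapping[str, str]) -> SailUsageError:
    """Build a SailUsageError from a status code, JSON body, and response headers."""
    err = body.get("error", {})
    retry_after = None
    if status == 429:
        raw = headers.get("Retry-After")
        # Assumption: the value is delay-seconds; anything else is left as None.
        if raw is not None and raw.isdigit():
            retry_after = int(raw)
    return SailUsageError(
        status=status,
        error_type=err.get("type"),
        message=err.get("message", ""),
        retry_after=retry_after,
        request_id=headers.get("X-Request-ID"),
    )


body = {"error": {"message": "Rate limit exceeded", "type": "rate_limit_error",
                  "param": None, "code": None}}
err = parse_error(429, body, {"Retry-After": "30", "X-Request-ID": "example-id"})
print(err, err.retry_after, err.request_id)
```

With the requests library, resp.headers is case-insensitive, so the lookups work regardless of header casing; a plain dict, as in this example, needs the exact key.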

Response headers

All responses include X-Request-ID: <uuid>. Include this in support requests for faster debugging.

Data freshness

Usage data may have a slight delay and is not real-time.