Usage API - Sail Research

The Usage API lets you query your Sail usage metrics from scripts, CLIs, or dashboards. It uses the same API key you use for api.sailresearch.com.

Base URL

https://usage.sailresearch.com

Authentication

All requests require a Bearer API key in the Authorization header.

This is the same API key you use for the inference API at api.sailresearch.com. Create or manage keys from the Sail dashboard.

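import os
import requests

# Read the key from the environment rather than hard-coding it in the script.
headers = {"Authorization": f"Bearer {os.environ['SAIL_API_KEY']}"}
resp = requests.get("https://usage.sailresearch.com/v1/usage", headers=headers, timeout=10)
resp.raise_for_status()  # raise on 4xx/5xx instead of printing an error body
print(resp.json())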

GET /v1/usage

Returns a spending and balance summary for the requested time range.

| Parameter | Values | Default | Description |
|---|---|---|---|
| `range` | `24h`, `7d`, `30d`, `period` | `30d` | Time window. `period` is the current billing period. |

curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  "https://usage.sailresearch.com/v1/usage?range=7d" | jq

{
  "object": "usage.summary",
  "range": "7d",
  "period_spend": 5432,
  "burn_rate": 776,
  "balance": 10000,
  "balance_unavailable": false,
  "days_remaining": 12,
  "plan_type": "self_serve",
  "model_count": 3,
  "tokens": {
    "total": 1000000,
    "input": 600000,
    "output": 350000,
    "cached": 50000
  },
  "avg_cost_per_day": 776,
  "sla_mix": {
    "priority": 0.7,
    "standard": 0.3
  },
  "prior_period": {
    "period_spend": 4000,
    "model_count": 2,
    "tokens": {
      "total": 800000,
      "input": 500000,
      "output": 280000,
      "cached": 20000
    },
    "avg_cost_per_day": 571
  }
}

All monetary values are in cents. prior_period covers the equivalent window immediately before the requested range, for comparison.

sla_mix keys are the completion window names: asap, priority, standard, flex. Historical entries from before the canonical rate-card promotion may also appear with the legacy keys 15min (now priority) or 24hr (now flex).
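The notes above (values in cents, a prior_period for comparison, legacy sla_mix keys) can be handled client-side with a few small helpers. This is a minimal sketch rather than an official client: the function names are hypothetical, the dollar formatting assumes the cents are US cents, and summing a legacy key into its current name is one reasonable way to merge them.

```python
from collections import defaultdict

# Legacy completion-window keys and their current names, per the note above.
LEGACY_SLA_KEYS = {"15min": "priority", "24hr": "flex"}


def normalize_sla_mix(sla_mix):
    """Fold legacy keys into their current names, summing overlapping shares."""
    merged = defaultdict(float)
    for key, share in sla_mix.items():
        merged[LEGACY_SLA_KEYS.get(key, key)] += share
    return dict(merged)


def cents_to_dollars(cents):
    """Format an integer number of cents as a dollar string (assumes USD)."""
    return f"${cents / 100:,.2f}"


def spend_change(summary):
    """Relative change in spend versus prior_period, or None if prior spend is 0."""
    prior = summary["prior_period"]["period_spend"]
    if prior == 0:
        return None
    return (summary["period_spend"] - prior) / prior
```

With the sample response above, `cents_to_dollars(summary["period_spend"])` gives a display string for the 7-day spend, and `spend_change(summary)` gives the fractional increase over the preceding 7 days.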

GET /v1/usage/breakdown

Returns time-series spend data and per-model ranking. Use this for charting.

| Parameter | Values | Default | Description |
|---|---|---|---|
| `range` | `24h`, `7d`, `30d`, `period`, `day` | `30d` | Time window. |
| `date` | `YYYY-MM-DD` | (none) | Required when `range=day`. Drills into hourly data for that date. |

curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  "https://usage.sailresearch.com/v1/usage/breakdown?range=7d" | jq

{
  "object": "usage.breakdown",
  "range": "7d",
  "granularity": "day",
  "data": [
    {
      "timestamp": "2025-01-15",
      "total": 800,
      "models": {
        "zai-org/GLM-5.1-FP8": {
          "total": 500,
          "tokens": 50000,
          "input_tokens": 30000,
          "output_tokens": 15000,
          "cached_tokens": 5000
        }
      },
      "slas": { "priority": 800 }
    }
  ],
  "models": [
    {
      "model": "zai-org/GLM-5.1-FP8",
      "total": 3500,
      "tokens": 350000,
      "input_tokens": 210000,
      "output_tokens": 105000,
      "cached_tokens": 35000,
      "slas": { "priority": 3500 },
      "percentage": 0.65
    }
  ]
}

granularity is "day" for the 7d, 30d, and period ranges, and "hour" for the 24h range and for the range=day drill-down.
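Because date is only required when range=day, it can help to validate the query parameters before sending the request. The sketch below also reshapes the data array into a per-timestamp mapping suitable for charting. Both function names are hypothetical, and the code assumes the per-model total is a spend figure in cents, as with the summary endpoint.

```python
from datetime import date

ALLOWED_RANGES = {"24h", "7d", "30d", "period", "day"}


def breakdown_params(range_="30d", day=None):
    """Build query parameters for /v1/usage/breakdown, enforcing the date rule."""
    if range_ not in ALLOWED_RANGES:
        raise ValueError(f"unsupported range: {range_!r}")
    params = {"range": range_}
    if range_ == "day":
        if day is None:
            raise ValueError("date is required when range=day")
        params["date"] = day.isoformat()  # YYYY-MM-DD
    return params


def spend_by_model(breakdown):
    """Map each timestamp to {model: total} from the response's data array."""
    return {
        point["timestamp"]: {
            model: stats["total"] for model, stats in point["models"].items()
        }
        for point in breakdown["data"]
    }
```

The resulting dict from `breakdown_params` can be passed as `params=` to `requests.get`, for example `breakdown_params("day", date(2025, 1, 15))`.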

GET /v1/usage/activity

Returns operational metrics: request counts, token throughput, latency, and recent requests.

| Parameter | Values | Default | Description |
|---|---|---|---|
| `limit` | `1`–`100` | `10` | Number of recent requests to return. |

curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  https://usage.sailresearch.com/v1/usage/activity | jq

{
  "object": "usage.activity",
  "available": true,
  "requests": {
    "last_1m": 5,
    "last_1h": 120,
    "last_24h": 2000,
    "last_7d": 14000
  },
  "tokens": {
    "last_1m": 500,
    "last_1h": 12000,
    "last_24h": 200000,
    "last_7d": 1400000,
    "token_breakdown_1h": {
      "input": 8000,
      "output": 3500,
      "cached": 500
    }
  },
  "latency": {
    "avg_1m_ms": 1200,
    "avg_1h_ms": 1500
  },
  "recent_requests": [
    {
      "response_id": "resp_abc123",
      "model": "zai-org/GLM-5.1-FP8",
      "sla": "priority",
      "status": "completed",
      "created_at": "2025-01-15T10:00:00Z",
      "updated_at": "2025-01-15T10:01:00Z"
    }
  ],
  "has_more": false
}

If has_more is true, there are additional recent requests beyond the requested limit.

Activity data may have a slight delay and is not a real-time stream.
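As an illustration of working with recent_requests, the sketch below lists each request with the time elapsed between created_at and updated_at. Treat that gap only as a rough proxy for processing time, since the page does not define it as such. The function names are hypothetical, and the code assumes that available set to false means no activity data should be read.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    # Replacing Z keeps this working on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def elapsed_seconds(request):
    """Seconds between a request's created_at and updated_at."""
    delta = parse_ts(request["updated_at"]) - parse_ts(request["created_at"])
    return delta.total_seconds()


def summarize_recent(activity):
    """Return (response_id, status, elapsed seconds) for each recent request."""
    if not activity.get("available", False):
        return []
    return [
        (r["response_id"], r["status"], elapsed_seconds(r))
        for r in activity["recent_requests"]
    ]
```

If has_more is true in the response, the list covers only the first limit requests.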

GET /v1/usage/activity/timeseries

Returns bucketed time-series data for requests or tokens. Use this for throughput charts.

| Parameter | Values | Default | Description |
|---|---|---|---|
| `type` | `requests`, `tokens` | `requests` | What to chart. |
| `range` | `1h`, `6h`, `24h` | `24h` | Time window. Bucket size is 1 minute for `1h`, 5 minutes for `6h`, and 1 hour for `24h`. |

Requests by model:

curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  "https://usage.sailresearch.com/v1/usage/activity/timeseries?type=requests&range=1h" | jq

{
  "object": "usage.activity.timeseries",
  "type": "requests",
  "range": "1h",
  "available": true,
  "series": [
    {
      "time_bucket": "2025-01-15T10:00:00Z",
      "model": "zai-org/GLM-5.1-FP8",
      "count": 5
    },
    {
      "time_bucket": "2025-01-15T10:01:00Z",
      "model": "zai-org/GLM-5.1-FP8",
      "count": 3
    }
  ]
}

Token breakdown:

curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
  "https://usage.sailresearch.com/v1/usage/activity/timeseries?type=tokens&range=24h" | jq

{
  "object": "usage.activity.timeseries",
  "type": "tokens",
  "range": "24h",
  "available": true,
  "series": [
    {
      "time_bucket": "2025-01-15T10:00:00Z",
      "total_tokens": 1000,
      "input_tokens": 600,
      "output_tokens": 350,
      "cached_tokens": 50
    }
  ]
}
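For throughput charts, the type=requests series usually needs to be aggregated per model or converted from counts to rates. The following sketch does both, using the bucket sizes listed in the parameter table. The function names and the BUCKET_SECONDS constant are hypothetical, not part of the API.

```python
from collections import defaultdict

# Bucket width in seconds for each supported range (1 min / 5 min / 1 hour).
BUCKET_SECONDS = {"1h": 60, "6h": 300, "24h": 3600}


def requests_per_model(timeseries):
    """Sum request counts per model across all buckets in the series."""
    totals = defaultdict(int)
    for point in timeseries["series"]:
        totals[point["model"]] += point["count"]
    return dict(totals)


def request_rates(timeseries):
    """Convert each bucket's count to an average requests-per-second rate."""
    width = BUCKET_SECONDS[timeseries["range"]]
    return [
        (point["time_bucket"], point["model"], point["count"] / width)
        for point in timeseries["series"]
    ]
```

Both helpers expect a type=requests response. The type=tokens series has no model or count fields, so it would need its own handling.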

Errors

All errors follow the same format used by the inference API:

{
  "error": {
    "message": "Missing or invalid Authorization header",
    "type": "authentication_error",
    "param": null,
    "code": null
  }
}

| HTTP Status | `type` | When |
|---|---|---|
| 400 | `invalid_request_error` | Bad query parameter, missing required param |
| 400 | `idempotency_error` | Idempotency key reused with a different request body |
| 401 | `authentication_error` | Missing, invalid, or expired API key |
| 402 | `billing_error` | Key disabled due to insufficient credits |
| 429 | `rate_limit_error` | Rate limit exceeded (includes `Retry-After` header) |
| 500 | `api_error` | Internal server error |
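A client can use the status code and the Retry-After header to decide whether and when to retry. The sketch below is one possible policy, not documented behavior: treating 500 as retryable, using exponential backoff as the fallback, and assuming Retry-After carries a number of seconds are all assumptions, and the function names are hypothetical.

```python
# Assumed policy: retry rate limits and internal errors, nothing else.
RETRYABLE_STATUSES = {429, 500}


def error_summary(status, body):
    """Render an error response body as a one-line log message."""
    err = body.get("error", {})
    return f"{status} {err.get('type', 'unknown')}: {err.get('message', '')}"


def retry_delay(status, headers, attempt, base=1.0, cap=30.0):
    """Seconds to wait before retrying, or None if the error is not retryable."""
    if status not in RETRYABLE_STATUSES:
        return None
    retry_after = headers.get("Retry-After")
    if status == 429 and retry_after is not None:
        try:
            return float(retry_after)
        except ValueError:
            pass  # header was not a plain number of seconds; fall back to backoff
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * 2 ** attempt)
```

When used with the requests library, `resp.status_code`, `resp.headers`, and `resp.json()` supply the three inputs. Errors such as 401 or 402 return None because retrying will not help until the key or billing issue is fixed.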

Response headers

All responses include X-Request-ID: <uuid>. Include this in support requests for faster debugging.
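One way to make the request ID easy to include in a support request is to capture it alongside every logged call. This is a small sketch with a hypothetical helper name. It does a case-insensitive lookup so it works with a plain dict as well as with the headers object from the requests library.

```python
def request_id(headers):
    """Return the X-Request-ID header value, or None if it is absent."""
    for name, value in headers.items():
        if name.lower() == "x-request-id":
            return value
    return None
```

For example, `request_id(resp.headers)` can be appended to any error message your script logs.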

Data freshness

Usage data may have a slight delay and is not real-time.
