GET /v1/usage
Returns a spending and balance summary for the requested time range.
| Parameter | Values | Default | Description |
|---|---|---|---|
| range | 24h, 7d, 30d, period | 30d | Time window. period = current billing period. |
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
"https://usage.sailresearch.com/v1/usage?range=7d" | jq
{
"object": "usage.summary",
"range": "7d",
"period_spend": 5432,
"burn_rate": 776,
"balance": 10000,
"balance_unavailable": false,
"days_remaining": 12,
"plan_type": "self_serve",
"model_count": 3,
"tokens": {
"total": 1000000,
"input": 600000,
"output": 350000,
"cached": 50000
},
"avg_cost_per_day": 776,
"sla_mix": {
"priority": 0.7,
"standard": 0.3
},
"prior_period": {
"period_spend": 4000,
"model_count": 2,
"tokens": {
"total": 800000,
"input": 500000,
"output": 280000,
"cached": 20000
},
"avg_cost_per_day": 571
}
}
All monetary values are in cents. prior_period covers the equivalent
window immediately before the requested range, for comparison.
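As a rough illustration of working with these fields, the sketch below converts cent amounts to dollars and computes the change in spend relative to prior_period. The helper names `cents_to_dollars` and `spend_change` are hypothetical, not part of any client library, and the sample values are copied from the example response above.

```python
from decimal import Decimal
from typing import Optional


def cents_to_dollars(cents: int) -> Decimal:
    """Convert an integer cent amount to dollars without float rounding."""
    return Decimal(cents) / Decimal(100)


def spend_change(summary: dict) -> Optional[float]:
    """Fractional change in period_spend versus prior_period.

    Returns None when there is no prior spend to compare against.
    """
    prior = (summary.get("prior_period") or {}).get("period_spend")
    if not prior:
        return None
    return (summary["period_spend"] - prior) / prior


# Values copied from the example response above.
summary = {"period_spend": 5432, "prior_period": {"period_spend": 4000}}
print(cents_to_dollars(summary["period_spend"]))  # 54.32
print(f"{spend_change(summary):+.1%}")            # +35.8%
```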
sla_mix keys are the completion window names: asap,
priority, standard, flex. Historical entries from before the canonical
rate-card promotion may also appear with the legacy keys 15min (now
priority) or 24hr (now flex).
GET /v1/usage/breakdown
Returns time-series spend data and per-model ranking. Use this for charting.
| Parameter | Values | Default | Description |
|---|---|---|---|
| range | 24h, 7d, 30d, period, day | 30d | Time window. day selects a single date (see date). |
| date | YYYY-MM-DD | — | Required when range=day. Drills into hourly data for that date. |
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
"https://usage.sailresearch.com/v1/usage/breakdown?range=7d" | jq
{
"object": "usage.breakdown",
"range": "7d",
"granularity": "day",
"data": [
{
"timestamp": "2025-01-15",
"total": 800,
"models": {
"zai-org/GLM-5.1-FP8": {
"total": 500,
"tokens": 50000,
"input_tokens": 30000,
"output_tokens": 15000,
"cached_tokens": 5000
}
},
"slas": { "priority": 800 }
}
],
"models": [
{
"model": "zai-org/GLM-5.1-FP8",
"total": 3500,
"tokens": 350000,
"input_tokens": 210000,
"output_tokens": 105000,
"cached_tokens": 35000,
"slas": { "priority": 3500 },
"percentage": 0.65
}
]
}
granularity is "day" for the 7d, 30d, and period ranges, and "hour" for 24h and for the range=day drill-down.
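The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, assuming hypothetical helper names (`breakdown_url`, `expected_granularity`); it only builds the URL and does not perform the request.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://usage.sailresearch.com/v1/usage/breakdown"
DAILY_RANGES = {"7d", "30d", "period"}   # granularity "day"
HOURLY_RANGES = {"24h", "day"}           # granularity "hour"


def breakdown_url(range_: str = "30d", day: Optional[date] = None) -> str:
    """Build the request URL, enforcing that range=day carries a date."""
    if range_ not in DAILY_RANGES | HOURLY_RANGES:
        raise ValueError(f"unsupported range: {range_!r}")
    params = {"range": range_}
    if range_ == "day":
        if day is None:
            raise ValueError("date is required when range=day")
        params["date"] = day.isoformat()  # YYYY-MM-DD
    # A date passed with any other range is ignored in this sketch.
    return f"{BASE_URL}?{urlencode(params)}"


def expected_granularity(range_: str) -> str:
    """Granularity the response should report for a given range."""
    return "hour" if range_ in HOURLY_RANGES else "day"


print(breakdown_url("day", date(2025, 1, 15)))
# https://usage.sailresearch.com/v1/usage/breakdown?range=day&date=2025-01-15
```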
GET /v1/usage/activity
Returns operational metrics: request counts, token throughput, latency, and recent requests.
| Parameter | Values | Default | Description |
|---|---|---|---|
| limit | 1–100 | 10 | Number of recent requests to return. |
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
https://usage.sailresearch.com/v1/usage/activity | jq
{
"object": "usage.activity",
"available": true,
"requests": {
"last_1m": 5,
"last_1h": 120,
"last_24h": 2000,
"last_7d": 14000
},
"tokens": {
"last_1m": 500,
"last_1h": 12000,
"last_24h": 200000,
"last_7d": 1400000,
"token_breakdown_1h": {
"input": 8000,
"output": 3500,
"cached": 500
}
},
"latency": {
"avg_1m_ms": 1200,
"avg_1h_ms": 1500
},
"recent_requests": [
{
"response_id": "resp_abc123",
"model": "zai-org/GLM-5.1-FP8",
"sla": "priority",
"status": "completed",
"created_at": "2025-01-15T10:00:00Z",
"updated_at": "2025-01-15T10:01:00Z"
}
],
"has_more": false
}
If has_more is true, there are additional recent requests beyond the requested limit.
Activity data may have a slight delay and is not a real-time stream.
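This section does not describe a pagination cursor, so the sketch below assumes the only way to see more recent requests is to ask again with a larger limit, up to the documented maximum of 100. The helper name `next_limit` and the doubling strategy are illustrative choices, not part of the API.

```python
from typing import Optional

MAX_LIMIT = 100  # upper bound of the documented 1-100 range


def next_limit(current: int, has_more: bool) -> Optional[int]:
    """Return a larger limit to retry with, or None if there is nothing to gain."""
    if not 1 <= current <= MAX_LIMIT:
        raise ValueError("limit must be between 1 and 100")
    if not has_more or current == MAX_LIMIT:
        return None
    # Doubling is an arbitrary growth strategy chosen for this sketch.
    return min(current * 2, MAX_LIMIT)


print(next_limit(10, has_more=True))    # 20
print(next_limit(80, has_more=True))    # 100
print(next_limit(100, has_more=True))   # None
```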
GET /v1/usage/activity/timeseries
Returns bucketed time-series data for requests or tokens. Use this for throughput charts.
| Parameter | Values | Default | Description |
|---|---|---|---|
| type | requests, tokens | requests | What to chart. |
| range | 1h, 6h, 24h | 24h | Time window. Bucket size: 1 min / 5 min / 1 hour, respectively. |
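For sizing a chart, the window and bucket sizes in the table imply an upper bound on the number of buckets per range. The short calculation below works that out; `WINDOWS` and `max_buckets` are hypothetical names. Whether empty buckets are omitted from the response is not stated here, so treat the result as a maximum rather than an exact count.

```python
from datetime import timedelta

# range -> (window length, bucket size), as listed in the table above.
WINDOWS = {
    "1h": (timedelta(hours=1), timedelta(minutes=1)),
    "6h": (timedelta(hours=6), timedelta(minutes=5)),
    "24h": (timedelta(hours=24), timedelta(hours=1)),
}


def max_buckets(range_: str) -> int:
    """Largest number of time buckets a given range can contain."""
    window, bucket = WINDOWS[range_]
    return window // bucket


for name in WINDOWS:
    print(name, max_buckets(name))
# 1h 60
# 6h 72
# 24h 24
```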
Requests by model:
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
"https://usage.sailresearch.com/v1/usage/activity/timeseries?type=requests&range=1h" | jq
{
"object": "usage.activity.timeseries",
"type": "requests",
"range": "1h",
"available": true,
"series": [
{
"time_bucket": "2025-01-15T10:00:00Z",
"model": "zai-org/GLM-5.1-FP8",
"count": 5
},
{
"time_bucket": "2025-01-15T10:01:00Z",
"model": "zai-org/GLM-5.1-FP8",
"count": 3
}
]
}
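Because each entry in a requests series carries its own model, a charting client will usually regroup the flat list into one line per model. A minimal sketch of that regrouping, using the two points from the example response above (the function names are hypothetical):

```python
from collections import defaultdict


def requests_by_model(series: list) -> dict:
    """Group request counts into {model: {time_bucket: count}}."""
    grouped = defaultdict(dict)
    for point in series:
        grouped[point["model"]][point["time_bucket"]] = point["count"]
    return dict(grouped)


def total_requests(series: list) -> int:
    """Sum the counts across every bucket and model."""
    return sum(point["count"] for point in series)


# The two points from the example response above.
series = [
    {"time_bucket": "2025-01-15T10:00:00Z", "model": "zai-org/GLM-5.1-FP8", "count": 5},
    {"time_bucket": "2025-01-15T10:01:00Z", "model": "zai-org/GLM-5.1-FP8", "count": 3},
]
print(total_requests(series))  # 8
```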
Token breakdown:
curl -s -H "Authorization: Bearer $SAIL_API_KEY" \
"https://usage.sailresearch.com/v1/usage/activity/timeseries?type=tokens&range=24h" | jq
{
"object": "usage.activity.timeseries",
"type": "tokens",
"range": "24h",
"available": true,
"series": [
{
"time_bucket": "2025-01-15T10:00:00Z",
"total_tokens": 1000,
"input_tokens": 600,
"output_tokens": 350,
"cached_tokens": 50
}
]
}
Errors
All errors follow the same format used by the inference API:
{
"error": {
"message": "Missing or invalid Authorization header",
"type": "authentication_error",
"param": null,
"code": null
}
}
| HTTP Status | type | When |
|---|---|---|
| 400 | invalid_request_error | Bad query parameter, missing required param |
| 400 | idempotency_error | Idempotency key reused with a different request body |
| 401 | authentication_error | Missing, invalid, or expired API key |
| 402 | billing_error | Key disabled due to insufficient credits |
| 429 | rate_limit_error | Rate limit exceeded (includes Retry-After header) |
| 500 | api_error | Internal server error |
All responses include an X-Request-ID: <uuid> header. Quote it in support requests to speed up debugging.
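A client can flatten the error envelope, the Retry-After header, and X-Request-ID into a single record for logging and retry decisions. The sketch below is one possible shape, not an official client: `parse_error` is a hypothetical name, treating 429 and 500 as retryable is an assumption, and only the delta-seconds form of Retry-After is handled.

```python
import json
from typing import Mapping


def parse_error(status: int, headers: Mapping[str, str], body: str) -> dict:
    """Flatten an error response into fields useful for logging and retries."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}
    # Header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = lowered.get("retry-after", "")
    return {
        "status": status,
        "type": error.get("type"),
        "message": error.get("message"),
        "request_id": lowered.get("x-request-id"),
        # Assumption: only rate limits and server errors are worth retrying.
        "retryable": status in (429, 500),
        # Only the delta-seconds form is parsed; an HTTP-date yields None.
        "retry_after_s": float(retry_after) if retry_after.isdigit() else None,
    }


# The header values below are made up for illustration.
body = '{"error": {"message": "Rate limit exceeded", "type": "rate_limit_error", "param": null, "code": null}}'
info = parse_error(429, {"Retry-After": "2", "X-Request-ID": "hypothetical-id"}, body)
print(info["type"], info["retry_after_s"])  # rate_limit_error 2.0
```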