Agent Cost Calculator

Estimate the inference cost of running your agent by picking a workload profile below or by setting your own per-turn token mix.

The interactive calculator requires JavaScript. Without it, here is a representative worked example: a deep-research agent that runs 50 turns and totals 4.0M cached input reads, 1.5M fresh input tokens, and 250K output tokens across the run, priced on Sail’s zai-org/GLM-5.2-FP8 and on generic frontier-class tiers at list prices (Sonnet-class: $3 input / $0.30 cache reads / $15 output per 1M tokens; Opus-class: $5 / $0.50 / $25).

Where it runs	Cost per run	vs Sail `standard`
Sail `flex`	$1.41	0.8x
Sail `standard`	$1.86	1x
Sail `priority`	$2.51	1.3x
Sail `asap`	$4.42	2.4x
Sonnet-class API, batch tier (24h window)	$5.18	2.8x
Sonnet-class API	$10.35	5.5x
Opus-class API	$17.25	9.2x

How the math works

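cost per run = turns × (fresh × P_input + cached × P_cached + output × P_output) / 1,000,000

Here fresh, cached, and output are average token counts per turn, and each P is a price in USD per 1M tokens.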

Fresh input: tokens the model reads for the first time each turn (new tool results, search snippets, file contents).
Cached input: tokens reread from the prompt cache each turn (the growing conversation history), billed at the cached rate.
Output: tokens the model writes (reasoning and answers).
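The formula and the three token categories above can be sketched in Python. This is a minimal illustration, not the calculator's own code: the names `Rates` and `cost_per_run` and the 20-turn workload are hypothetical, while the prices are the generic Sonnet-class list prices quoted on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Prices in USD per 1M tokens."""
    p_input: float   # fresh input tokens
    p_cached: float  # cache reads
    p_output: float  # output tokens


def cost_per_run(turns: int, fresh: float, cached: float, output: float,
                 rates: Rates) -> float:
    """Cost in USD of one run; fresh/cached/output are average tokens per turn."""
    per_turn = (fresh * rates.p_input
                + cached * rates.p_cached
                + output * rates.p_output)
    return turns * per_turn / 1_000_000


# Generic Sonnet-class list prices from this page.
SONNET_CLASS = Rates(p_input=3.00, p_cached=0.30, p_output=15.00)

# Hypothetical workload: 20 turns averaging 10K fresh, 50K cached, 2K output tokens.
example = cost_per_run(turns=20, fresh=10_000, cached=50_000, output=2_000,
                       rates=SONNET_CLASS)
print(f"${example:.2f}")  # $1.50
```

In this hypothetical mix, cached tokens are five times the fresh tokens by volume but contribute only half as much to the bill, which is why the cached rate matters so much for long-running agents.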

Sail rates come from the Pricing page for the selected model and completion window. Daily figures multiply the cost per run by runs per day; monthly figures multiply the daily figure by 30 days.
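As a rough sketch of that scaling step, the snippet below projects daily and monthly spend from a per-run cost. The function name `project_spend` and the figure of 200 runs per day are hypothetical; the $1.86 per-run cost is the Sail `standard` figure from the worked example.

```python
DAYS_PER_MONTH = 30  # the 30-day month stated in the text


def project_spend(cost_per_run: float, runs_per_day: int) -> tuple[float, float]:
    """Return (daily, monthly) cost in USD for a given per-run cost."""
    daily = cost_per_run * runs_per_day
    monthly = daily * DAYS_PER_MONTH
    return daily, monthly


# Hypothetical volume: 200 runs per day at the Sail `standard` per-run cost.
daily, monthly = project_spend(cost_per_run=1.86, runs_per_day=200)
print(f"daily: ${daily:,.2f}, monthly: ${monthly:,.2f}")
# daily: $372.00, monthly: $11,160.00
```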

Assumptions and caveats

Frontier tiers are generic list prices. “Sonnet-class” is $3 input / $0.30 cache reads / $15 output per 1M tokens (batch tier = 50% off inside a 24-hour window); “Opus-class” is $5 / $0.50 / $25.
The “same model, traditional provider” row is the model’s ASAP price. Sail sets its ASAP (always-on) rate to match what a traditional always-on provider charges for the same model, so it doubles as the “running this elsewhere” baseline, and the scheduled completion windows (flex, standard, priority) price below it. Models served only on ASAP show the row at ~1x; models with no ASAP tier use a fixed traditional-provider list price.
The model has to do the job. For simplicity, the math assumes a single frontier-class open model handles your whole task. In practice, we often see a hybrid approach that combines frontier closed models with open models, or mixes several open models.
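Two of the caveats above reduce to simple arithmetic on the worked example: the 50% batch discount and the position of the scheduled windows below the ASAP baseline. The sketch below reproduces both from the table's per-run costs. It assumes the batch discount applies uniformly to the whole list-price total, which is consistent with the table ($10.35 to $5.18) but may not hold for every provider, and it rounds half up to cents as an illustrative choice. The variable names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

# Per-run costs from the worked example table (USD).
sail = {"flex": 1.41, "standard": 1.86, "priority": 2.51, "asap": 4.42}
sonnet_list = Decimal("10.35")

# Batch tier: 50% off the list-price total (assumed uniform), rounded to cents.
sonnet_batch = (sonnet_list * Decimal("0.5")).quantize(CENT, rounding=ROUND_HALF_UP)
print(sonnet_batch)  # 5.18

# Scheduled windows relative to the ASAP baseline.
baseline = sail["asap"]
savings = {w: 1 - sail[w] / baseline for w in ("flex", "standard", "priority")}
for window, saving in savings.items():
    print(f"{window}: {saving:.0%} below ASAP")
# flex: 68% below ASAP
# standard: 58% below ASAP
# priority: 43% below ASAP
```

`Decimal` is used for the batch figure because halving 10.35 as a binary float can round to 5.17 instead of 5.18 when formatted to cents.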
