Cost Tracking

In plain terms

Every time an AI agent answers, it costs a small amount of money (you pay the model provider per word processed). Cost tracking is the meter and the spending limit — it counts every penny and can automatically stop an agent before it overspends. Why this matters: without it, a runaway agent or a heavy user can quietly run up a large bill. With it, you set a cap (“no more than $0.50 per user per month”) and the framework enforces it — like a prepaid card that simply stops working when the limit is hit.

The analogy: it’s the utility meter and the circuit breaker for your AI. You always know what you’re spending, and it cuts off before things get expensive.

Quick Start

import { Agent, openai, CostTracker } from "@agentium/core";

const tracker = new CostTracker({
  budget: {
    maxCostPerSession: 1.0,   // $1 per session
    maxCostPerUser: 10.0,     // $10 per user
    onBudgetExceeded: "throw",
  },
});

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

await agent.run("Hello!", { sessionId: "s1", userId: "user-42" });

const summary = tracker.getSummary();
console.log(`Total cost: $${summary.totalCost.toFixed(4)}`);

Token Types Tracked

The CostTracker captures all token types returned by the API:

Token Type	Field	Description
Input	`promptTokens`	Tokens in the prompt / input
Output	`completionTokens`	Tokens generated in the response
Reasoning	`reasoningTokens`	Tokens used for chain-of-thought (o1, o3)
Cached	`cachedTokens`	Prompt tokens served from API cache
Audio Input	`audioInputTokens`	Tokens from audio input (Realtime API)
Audio Output	`audioOutputTokens`	Tokens for audio output (Realtime API)

All token types are tracked per-message, per-run, and per-session. The getSummary() method aggregates totals across all tracked dimensions.

Raw Provider Metrics

Every RunOutput.usage object includes a providerMetrics field containing the raw, unmodified usage data returned by the provider API. This gives full transparency without any normalization loss:

const result = await agent.run("Hello!");

// Normalized Agentium fields
console.log(result.usage.promptTokens);       // 16
console.log(result.usage.completionTokens);   // 10

// Raw provider-specific data (varies by provider)
console.log(result.usage.providerMetrics);

Example providerMetrics by provider:

{
  "prompt_tokens": 16,
  "completion_tokens": 10,
  "total_tokens": 26,
  "prompt_tokens_details": { "cached_tokens": 0 },
  "completion_tokens_details": { "reasoning_tokens": 0 }
}

Cost Breakdown

Each cost entry includes a 6-category breakdown:

interface CostBreakdown {
  input: number;       // Prompt token cost
  output: number;      // Completion token cost
  reasoning: number;   // Reasoning token cost
  cached: number;      // Cached prompt cost (discounted rate)
  audioInput: number;  // Audio input token cost
  audioOutput: number; // Audio output token cost
  total: number;       // Sum of all categories
}

Access per-run or aggregated:

// Per-entry breakdown
const entry = tracker.getEntries()[0];
console.log(entry.breakdown.input);    // $0.000375
console.log(entry.breakdown.output);   // $0.004956
console.log(entry.breakdown.total);    // $0.005331

// Aggregated summary
const summary = tracker.getSummary();
console.log(summary.totalBreakdown.total);       // Total across all entries
console.log(summary.byAgent["assistant"].breakdown); // Per-agent breakdown
console.log(summary.byModel["gpt-4o"].breakdown);   // Per-model breakdown
console.log(summary.byUser["user-42"].breakdown);    // Per-user breakdown

Built-in Pricing

Pricing is included for 50+ models:

Model	Prompt / 1K	Completion / 1K
gpt-4o	$0.0025	$0.01
gpt-4o-mini	$0.00015	$0.0006
claude-3.5-sonnet	$0.003	$0.015
gemini-2.0-flash	$0.0001	$0.0004

Override or extend pricing:

const tracker = new CostTracker({
  pricing: {
    "my-custom-model": {
      promptPer1k: 0.005,
      completionPer1k: 0.02,
      reasoningPer1k: 0.06,        // Optional: reasoning token pricing
      cachedPromptPer1k: 0.0005,   // Optional: cached prompt pricing
      audioInputPer1k: 0.1,        // Optional: audio input pricing
      audioOutputPer1k: 0.2,       // Optional: audio output pricing
    },
  },
});

Budget Enforcement

Budgets are checked before each LLM call and mid-run during tool-calling loops:

interface CostBudget {
  maxCostPerRun?: number;      // Per individual run
  maxCostPerSession?: number;  // Across all runs in a session
  maxCostPerUser?: number;     // Across all sessions for a user
  maxTokensPerRun?: number;    // Token limit per run
  onBudgetExceeded?: "throw" | "warn";
}

See Cost Auto-Stop for mid-run enforcement details.

Works Across All Agent Types

The same CostTracker instance can be shared across different agent types:

import { Agent, VoiceAgent, openai, openaiRealtime, CostTracker } from "@agentium/core";
import { BrowserAgent } from "@agentium/browser";

const tracker = new CostTracker({
  budget: { maxCostPerUser: 10.0 },
});

// Text agent
const textAgent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

// Voice agent
const voiceAgent = new VoiceAgent({
  name: "voice-assistant",
  provider: openaiRealtime("gpt-4o-realtime-preview"),
  costTracker: tracker,
});

// Browser agent
const browserAgent = new BrowserAgent({
  name: "web-scraper",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

// All three agents report to the same tracker
await textAgent.run("Hello!", { userId: "user-42" });
const session = await voiceAgent.connect({ userId: "user-42" });
await browserAgent.run("Search for flights", { userId: "user-42" });

const summary = tracker.getSummary();
console.log(summary.byAgent);
// { assistant: {...}, "voice-assistant": {...}, "web-scraper": {...} }
console.log(summary.byUser["user-42"].cost); // Combined cost across all agents

Cost Summary

const summary = tracker.getSummary({ userId: "user-42" });

summary.totalCost;              // Total USD
summary.totalTokens;            // Aggregated TokenUsage
summary.totalBreakdown;         // Aggregated CostBreakdown
summary.byAgent["assistant"];   // { cost, breakdown, tokens, runs }
summary.byModel["gpt-4o"];     // { cost, breakdown, tokens }
summary.byUser["user-42"];     // { cost, breakdown, tokens }

Events

Event	Payload
`cost.tracked`	`{ runId, agentName, modelId, usage, cost }`
`cost.budget.exceeded`	`{ runId, agentName, budget, current, limit }`

Subscribing to Cost Events

Listen for cost events to build dashboards, alerts, or analytics:

import { Agent, openai, CostTracker } from "@agentium/core";

const tracker = new CostTracker({
  budget: { maxCostPerSession: 2.0, onBudgetExceeded: "warn" },
});

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

agent.on("cost.tracked", ({ runId, agentName, modelId, usage }) => {
  console.log(
    `[Cost] ${agentName} / ${modelId}: ` +
    `${usage.promptTokens} prompt + ${usage.completionTokens} completion`
  );
});

agent.on("cost.budget.exceeded", ({ agentName, budget, current, limit }) => {
  console.warn(
    `[Budget] ${agentName} exceeded ${budget}: $${current.toFixed(2)} / $${limit.toFixed(2)}`
  );
});

Per-Agent and Per-Model Breakdown

const tracker = new CostTracker();

const assistantAgent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

const routerAgent = new Agent({
  name: "router",
  model: openai("gpt-4o-mini"),
  costTracker: tracker,
});

await assistantAgent.run("Complex analysis task", { userId: "user-42" });
await routerAgent.run("Route this request", { userId: "user-42" });

const summary = tracker.getSummary();

// Per-agent costs with breakdowns
for (const [name, data] of Object.entries(summary.byAgent)) {
  console.log(`${name}: $${data.cost.toFixed(4)} (${data.runs} runs)`);
  console.log(`  input: $${data.breakdown.input.toFixed(6)}`);
  console.log(`  output: $${data.breakdown.output.toFixed(6)}`);
}

// Per-model costs
for (const [model, data] of Object.entries(summary.byModel)) {
  console.log(`${model}: $${data.cost.toFixed(4)}`);
}

// Per-user costs
for (const [user, data] of Object.entries(summary.byUser)) {
  console.log(`${user}: $${data.cost.toFixed(4)}`);
}

Custom Pricing for Non-Built-in Models

const tracker = new CostTracker({
  pricing: {
    "ft:gpt-4o-mini:my-org:custom:abc123": {
      promptPer1k: 0.0003,
      completionPer1k: 0.0012,
    },
    "llama3.1": {
      promptPer1k: 0,
      completionPer1k: 0,
    },
    "claude-opus-next": {
      promptPer1k: 0.015,
      completionPer1k: 0.075,
      reasoningPer1k: 0.06,
    },
  },
});

If a model has no pricing entry (built-in or custom), the cost is recorded as $0 but token counts are still tracked.

Budget Enforcement in Practice

const tracker = new CostTracker({
  budget: {
    maxCostPerRun: 0.50,
    maxCostPerSession: 2.00,
    maxCostPerUser: 20.00,
    maxTokensPerRun: 50_000,
    onBudgetExceeded: "throw",
  },
});

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

try {
  await agent.run("Analyze this massive dataset...", {
    sessionId: "s1",
    userId: "user-42",
  });
} catch (error) {
  if (error.name === "CostBudgetExceededError") {
    console.log("Budget exceeded — inform the user or switch to a cheaper model");
  }
}

With onBudgetExceeded: "warn", the run continues but emits a cost.budget.exceeded event instead of throwing.

Token Accuracy

Agentium verifies 100% accuracy between CostTracker recorded tokens and raw API response tokens across all scenarios — simple completion, tool calling, multi-turn memory, and prompt caching. See benchmarks for detailed validation results.

Cross-References

Agent Overview — costTracker in AgentConfig
Voice Agents — costTracker in VoiceAgentConfig
Browser Agents — costTracker in BrowserAgentConfig
Cost Auto-Stop — Mid-run budget enforcement
Observability — Track costs alongside traces and metrics

​Cost Tracking

​In plain terms

​Quick Start

​Token Types Tracked

​Raw Provider Metrics

​Cost Breakdown

​Built-in Pricing

​Budget Enforcement

​Works Across All Agent Types

​Cost Summary

​Events

​Subscribing to Cost Events

​Per-Agent and Per-Model Breakdown

​Custom Pricing for Non-Built-in Models

​Budget Enforcement in Practice

​Token Accuracy

​Cross-References