Documentation Index Fetch the complete documentation index at: https://docs.xhipai.com/llms.txt
Use this file to discover all available pages before exploring further.
Cost Tracking
The CostTracker tracks token usage and costs across agent runs. It supports budget enforcement, per-model pricing, detailed cost breakdowns by category, and per-agent/model/user summaries.
Quick Start
import { Agent , openai , CostTracker } from "@agentium/core" ;
const tracker = new CostTracker ({
budget: {
maxCostPerSession: 1.0 , // $1 per session
maxCostPerUser: 10.0 , // $10 per user
onBudgetExceeded: "throw" ,
},
});
const agent = new Agent ({
name: "assistant" ,
model: openai ( "gpt-4o" ),
costTracker: tracker ,
});
await agent . run ( "Hello!" , { sessionId: "s1" , userId: "user-42" });
const summary = tracker . getSummary ();
console . log ( `Total cost: $ ${ summary . totalCost . toFixed ( 4 ) } ` );
Token Types Tracked
The CostTracker captures all token types returned by the API:
Token Type Field Description Input promptTokensTokens in the prompt / input Output completionTokensTokens generated in the response Reasoning reasoningTokensTokens used for chain-of-thought (o1, o3) Cached cachedTokensPrompt tokens served from API cache Audio Input audioInputTokensTokens from audio input (Realtime API) Audio Output audioOutputTokensTokens for audio output (Realtime API)
All token types are tracked per-message, per-run, and per-session. The getSummary() method aggregates totals across all tracked dimensions.
Raw Provider Metrics
Every RunOutput.usage object includes a providerMetrics field containing the raw, unmodified usage data returned by the provider API. This gives full transparency without any normalization loss:
const result = await agent . run ( "Hello!" );
// Normalized Agentium fields
console . log ( result . usage . promptTokens ); // 16
console . log ( result . usage . completionTokens ); // 10
// Raw provider-specific data (varies by provider)
console . log ( result . usage . providerMetrics );
Example providerMetrics by provider:
OpenAI
Google / Vertex AI
Anthropic
AWS Bedrock
Ollama
{
"prompt_tokens" : 16 ,
"completion_tokens" : 10 ,
"total_tokens" : 26 ,
"prompt_tokens_details" : { "cached_tokens" : 0 },
"completion_tokens_details" : { "reasoning_tokens" : 0 }
}
Cost Breakdown
Each cost entry includes a 6-category breakdown:
interface CostBreakdown {
input : number ; // Prompt token cost
output : number ; // Completion token cost
reasoning : number ; // Reasoning token cost
cached : number ; // Cached prompt cost (discounted rate)
audioInput : number ; // Audio input token cost
audioOutput : number ; // Audio output token cost
total : number ; // Sum of all categories
}
Access per-run or aggregated:
// Per-entry breakdown
const entry = tracker . getEntries ()[ 0 ];
console . log ( entry . breakdown . input ); // $0.000375
console . log ( entry . breakdown . output ); // $0.004956
console . log ( entry . breakdown . total ); // $0.005331
// Aggregated summary
const summary = tracker . getSummary ();
console . log ( summary . totalBreakdown . total ); // Total across all entries
console . log ( summary . byAgent [ "assistant" ]. breakdown ); // Per-agent breakdown
console . log ( summary . byModel [ "gpt-4o" ]. breakdown ); // Per-model breakdown
console . log ( summary . byUser [ "user-42" ]. breakdown ); // Per-user breakdown
Built-in Pricing
Pricing is included for 50+ models:
Model Prompt / 1K Completion / 1K gpt-4o $0.0025 $0.01 gpt-4o-mini $0.00015 $0.0006 claude-3.5-sonnet $0.003 $0.015 gemini-2.0-flash $0.0001 $0.0004
Override or extend pricing:
const tracker = new CostTracker ({
pricing: {
"my-custom-model" : {
promptPer1k: 0.005 ,
completionPer1k: 0.02 ,
reasoningPer1k: 0.06 , // Optional: reasoning token pricing
cachedPromptPer1k: 0.0005 , // Optional: cached prompt pricing
audioInputPer1k: 0.1 , // Optional: audio input pricing
audioOutputPer1k: 0.2 , // Optional: audio output pricing
},
},
});
Budget Enforcement
Budgets are checked before each LLM call and mid-run during tool-calling loops:
interface CostBudget {
maxCostPerRun ?: number ; // Per individual run
maxCostPerSession ?: number ; // Across all runs in a session
maxCostPerUser ?: number ; // Across all sessions for a user
maxTokensPerRun ?: number ; // Token limit per run
onBudgetExceeded ?: "throw" | "warn" ;
}
See Cost Auto-Stop for mid-run enforcement details.
Works Across All Agent Types
The same CostTracker instance can be shared across different agent types:
import { Agent , VoiceAgent , openai , openaiRealtime , CostTracker } from "@agentium/core" ;
import { BrowserAgent } from "@agentium/browser" ;
const tracker = new CostTracker ({
budget: { maxCostPerUser: 10.0 },
});
// Text agent
const textAgent = new Agent ({
name: "assistant" ,
model: openai ( "gpt-4o" ),
costTracker: tracker ,
});
// Voice agent
const voiceAgent = new VoiceAgent ({
name: "voice-assistant" ,
provider: openaiRealtime ( "gpt-4o-realtime-preview" ),
costTracker: tracker ,
});
// Browser agent
const browserAgent = new BrowserAgent ({
name: "web-scraper" ,
model: openai ( "gpt-4o" ),
costTracker: tracker ,
});
// All three agents report to the same tracker
await textAgent . run ( "Hello!" , { userId: "user-42" });
const session = await voiceAgent . connect ({ userId: "user-42" });
await browserAgent . run ( "Search for flights" , { userId: "user-42" });
const summary = tracker . getSummary ();
console . log ( summary . byAgent );
// { assistant: {...}, "voice-assistant": {...}, "web-scraper": {...} }
console . log ( summary . byUser [ "user-42" ]. cost ); // Combined cost across all agents
Cost Summary
const summary = tracker . getSummary ({ userId: "user-42" });
summary . totalCost ; // Total USD
summary . totalTokens ; // Aggregated TokenUsage
summary . totalBreakdown ; // Aggregated CostBreakdown
summary . byAgent [ "assistant" ]; // { cost, breakdown, tokens, runs }
summary . byModel [ "gpt-4o" ]; // { cost, breakdown, tokens }
summary . byUser [ "user-42" ]; // { cost, breakdown, tokens }
Events
Event Payload cost.tracked{ runId, agentName, modelId, usage, cost }cost.budget.exceeded{ runId, agentName, budget, current, limit }
Subscribing to Cost Events
Listen for cost events to build dashboards, alerts, or analytics:
import { Agent , openai , CostTracker } from "@agentium/core" ;
const tracker = new CostTracker ({
budget: { maxCostPerSession: 2.0 , onBudgetExceeded: "warn" },
});
const agent = new Agent ({
name: "assistant" ,
model: openai ( "gpt-4o" ),
costTracker: tracker ,
});
agent . on ( "cost.tracked" , ({ runId , agentName , modelId , usage }) => {
console . log (
`[Cost] ${ agentName } / ${ modelId } : ` +
` ${ usage . promptTokens } prompt + ${ usage . completionTokens } completion`
);
});
agent . on ( "cost.budget.exceeded" , ({ agentName , budget , current , limit }) => {
console . warn (
`[Budget] ${ agentName } exceeded ${ budget } : $ ${ current . toFixed ( 2 ) } / $ ${ limit . toFixed ( 2 ) } `
);
});
Per-Agent and Per-Model Breakdown
const tracker = new CostTracker ();
const assistantAgent = new Agent ({
name: "assistant" ,
model: openai ( "gpt-4o" ),
costTracker: tracker ,
});
const routerAgent = new Agent ({
name: "router" ,
model: openai ( "gpt-4o-mini" ),
costTracker: tracker ,
});
await assistantAgent . run ( "Complex analysis task" , { userId: "user-42" });
await routerAgent . run ( "Route this request" , { userId: "user-42" });
const summary = tracker . getSummary ();
// Per-agent costs with breakdowns
for ( const [ name , data ] of Object . entries ( summary . byAgent )) {
console . log ( ` ${ name } : $ ${ data . cost . toFixed ( 4 ) } ( ${ data . runs } runs)` );
console . log ( ` input: $ ${ data . breakdown . input . toFixed ( 6 ) } ` );
console . log ( ` output: $ ${ data . breakdown . output . toFixed ( 6 ) } ` );
}
// Per-model costs
for ( const [ model , data ] of Object . entries ( summary . byModel )) {
console . log ( ` ${ model } : $ ${ data . cost . toFixed ( 4 ) } ` );
}
// Per-user costs
for ( const [ user , data ] of Object . entries ( summary . byUser )) {
console . log ( ` ${ user } : $ ${ data . cost . toFixed ( 4 ) } ` );
}
Custom Pricing for Non-Built-in Models
const tracker = new CostTracker ({
pricing: {
"ft:gpt-4o-mini:my-org:custom:abc123" : {
promptPer1k: 0.0003 ,
completionPer1k: 0.0012 ,
},
"llama3.1" : {
promptPer1k: 0 ,
completionPer1k: 0 ,
},
"claude-opus-next" : {
promptPer1k: 0.015 ,
completionPer1k: 0.075 ,
reasoningPer1k: 0.06 ,
},
},
});
If a model has no pricing entry (built-in or custom), the cost is recorded as $0 but token counts are still tracked.
Budget Enforcement in Practice
const tracker = new CostTracker ({
budget: {
maxCostPerRun: 0.50 ,
maxCostPerSession: 2.00 ,
maxCostPerUser: 20.00 ,
maxTokensPerRun: 50_000 ,
onBudgetExceeded: "throw" ,
},
});
const agent = new Agent ({
name: "assistant" ,
model: openai ( "gpt-4o" ),
costTracker: tracker ,
});
try {
await agent . run ( "Analyze this massive dataset..." , {
sessionId: "s1" ,
userId: "user-42" ,
});
} catch ( error ) {
if ( error . name === "CostBudgetExceededError" ) {
console . log ( "Budget exceeded — inform the user or switch to a cheaper model" );
}
}
With onBudgetExceeded: "warn", the run continues but emits a cost.budget.exceeded event instead of throwing.
Token Accuracy
Agentium verifies 100% accuracy between CostTracker recorded tokens and raw API response tokens across all scenarios — simple completion, tool calling, multi-turn memory, and prompt caching. See benchmarks for detailed validation results.
Cross-References