Overview
ReliabilityEval asserts that agents call expected tools, handle errors correctly, and produce non-empty responses.
Quick Start
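```typescript
import { ReliabilityEval } from "@agentium/eval";
import { Agent, openai } from "@agentium/core";

// searchTool and calcTool are assumed to be defined elsewhere in your project.
const agent = new Agent({
  name: "tool-agent",
  model: openai("gpt-4o"),
  tools: [searchTool, calcTool],
});

// Named reliabilityEval because `eval` cannot be used as a binding name in ES modules (strict mode).
const reliabilityEval = new ReliabilityEval({
  name: "tool-reliability",
  agent,
  cases: [
    // Expects the agent to call the "search" tool.
    { name: "uses-search", input: "Search for latest news", expectedTools: ["search"] },
    // Expects the run to throw an error.
    { name: "handles-error", input: "Divide by zero", shouldError: true },
  ],
});

const result = await reliabilityEval.run();
```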
Case Options
| Field | Type | Description |
|---|---|---|
| `expectedTools` | `string[]` | Tool names that should be called |
| `shouldError` | `boolean` | Whether the case should throw an error |
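
To make the two fields concrete, the sketch below shows one way a single case could be checked. It is an illustration only, not the library's implementation: `ReliabilityCase`, `RunRecord`, and `checkCase` are hypothetical names, and the pass rule (the error outcome matches `shouldError`, every expected tool was called, and a run that did not error produced a non-empty response) is an assumption based on the descriptions above.

```typescript
// Hypothetical shapes for illustration; not types exported by @agentium/eval.
interface ReliabilityCase {
  name: string;
  input: string;
  expectedTools?: string[];
  shouldError?: boolean;
}

interface RunRecord {
  toolsCalled: string[]; // names of the tools the agent invoked
  threwError: boolean;   // whether the run ended in an error
  response: string;      // final text response ("" if none)
}

// Assumed pass rule: the error outcome matches shouldError, every expected
// tool was called, and a non-error run produced a non-empty response.
function checkCase(testCase: ReliabilityCase, run: RunRecord): boolean {
  const shouldError = testCase.shouldError ?? false;
  if (run.threwError !== shouldError) return false;
  const called = new Set(run.toolsCalled);
  const toolsOk = (testCase.expectedTools ?? []).every((t) => called.has(t));
  const responseOk = shouldError || run.response.trim().length > 0;
  return toolsOk && responseOk;
}

const usesSearch: ReliabilityCase = {
  name: "uses-search",
  input: "Search for latest news",
  expectedTools: ["search"],
};
const passed = checkCase(usesSearch, { toolsCalled: ["search"], threwError: false, response: "Here are the headlines." });
const failed = checkCase(usesSearch, { toolsCalled: [], threwError: false, response: "Here are the headlines." });
console.log(passed, failed); // true false
```

Under this assumed rule, a case that sets only `shouldError: true` passes whenever the run throws, regardless of which tools were called.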
Use toolCallMatch as a standalone scorer:
```typescript
import { EvalSuite, toolCallMatch } from "@agentium/eval";

// `agent` is the Agent instance created in the Quick Start example.
const suite = new EvalSuite({
  name: "tools-test",
  agent,
  scorers: [toolCallMatch(["search", "calculate"])],
  cases: [{ name: "test", input: "Search and calculate" }],
});
```
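
A scorer of this kind can be thought of as a function from the tools a run actually called to a number. The sketch below is an assumption about what such a scorer might compute, namely the fraction of expected tools that were called. `toolCallCoverage` is a hypothetical name, and the real `toolCallMatch` may score differently (for example, all-or-nothing).

```typescript
// Hypothetical stand-in for a tool-call scorer; not the @agentium/eval implementation.
// Returns the fraction of expected tool names that appear in the called list.
function toolCallCoverage(expected: string[], called: string[]): number {
  if (expected.length === 0) return 1; // nothing expected, trivially satisfied
  const calledSet = new Set(called);
  const hits = expected.filter((name) => calledSet.has(name)).length;
  return hits / expected.length;
}

const full = toolCallCoverage(["search", "calculate"], ["search", "calculate"]);
const partial = toolCallCoverage(["search", "calculate"], ["search"]);
console.log(full, partial); // 1 0.5
```

A fractional score distinguishes a run that skipped one tool from a run that skipped all of them, which a boolean pass/fail would not.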