# Testing Agents
The Agents package ships with first-class testing primitives that let you exercise agent behavior without making real LLM calls. By combining FakeAgentDriver with FakeTool, you can script deterministic scenarios, assert on individual steps, and verify that your agent's tool-calling logic, error handling, and multi-step loops behave exactly as expected.
## FakeAgentDriver
FakeAgentDriver replaces a real driver (such as ToolCallingDriver or ReActDriver) with a scripted sequence of steps. Each step defines what the “LLM” would return — a final response, a tool call, an intermediate message, or an error. The driver advances through the script one step per loop iteration, giving you full control over the agent’s behavior.
### Creating a Driver from Simple Responses
The simplest way to create a fake driver is from one or more string responses. Each string becomes a FinalResponse step, meaning the agent loop will stop after consuming it:
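A minimal sketch of this pattern (the fromResponses factory name is illustrative; check the package for the exact API):

```php
// Hypothetical factory name: each string becomes a FinalResponse step,
// so the loop terminates after consuming it.
$driver = FakeAgentDriver::fromResponses(
    'Hello! How can I help you today?',
);
```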
### Creating a Driver from Scenario Steps
For more sophisticated tests, especially those involving tool calls, use ScenarioStep objects directly:
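For example, a two-step script that first requests a tool and then finishes (the constructor shape and toolCall() arguments are illustrative):

```php
$driver = new FakeAgentDriver([
    // Step 1: the scripted "LLM" asks for a tool.
    ScenarioStep::toolCall('get_weather', ['city' => 'Paris']),
    // Step 2: the scripted "LLM" produces the terminal answer.
    ScenarioStep::final('It is sunny in Paris.'),
]);
```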
## ScenarioStep Types
ScenarioStep is a readonly value object that describes what a single agent loop iteration should produce. Four factory methods cover the common cases:
### ScenarioStep::final()
Produces a FinalResponse step. The agent loop recognizes this as a terminal response and stops iterating:
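For example (a sketch; only the factory name comes from this document):

```php
// The loop stops after consuming this step.
$step = ScenarioStep::final('All done, no further tool calls needed.');
```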
### ScenarioStep::tool()
Produces a ToolExecution step type without attaching any tool calls. This is useful for simulating intermediate LLM responses that signal the loop should continue (because the step type is ToolExecution), but where no actual tool invocation is needed:
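A sketch of a script that continues for one iteration before finishing:

```php
$driver = new FakeAgentDriver([
    ScenarioStep::tool(),        // ToolExecution type, no tool calls: loop continues
    ScenarioStep::final('done'), // terminal step: loop stops
]);
```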
Note: Because no tool calls are attached, this step will not trigger the ToolExecutor. If you need actual tool execution, use ScenarioStep::toolCall() instead.
### ScenarioStep::error()
Produces an Error step. The step is created with a RuntimeException attached, which the agent loop treats as a failure:
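For example (assuming error() accepts a message that becomes the attached RuntimeException's message):

```php
$driver = new FakeAgentDriver([
    ScenarioStep::error('Simulated provider outage'),
]);
```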
### ScenarioStep::toolCall()
Produces a ToolExecution step with a tool call attached. This is the most powerful step type — it simulates the LLM requesting a specific tool and optionally executes it through the ToolExecutor:
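A sketch (only the executeTools parameter is named in this document; the other parameter names are assumptions):

```php
$step = ScenarioStep::toolCall(
    name: 'add_numbers',
    args: ['a' => 2, 'b' => 3],
    executeTools: true, // false: record the call without invoking the tool
);
```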
When executeTools is true, the tool call is forwarded to the ToolExecutor, which resolves the tool from the Tools collection and invokes it. Set it to false to skip execution entirely; this is useful when you only need to verify that the correct tool call was produced.
### Custom Usage Tracking
All step factories accept an optional InferenceUsage parameter for testing token-budget guards or usage reporting:
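For example (the InferenceUsage constructor shown here is illustrative; its actual fields may differ):

```php
$step = ScenarioStep::final(
    'done',
    usage: new InferenceUsage(inputTokens: 120, outputTokens: 30),
);
```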
## FakeTool
FakeTool creates tool stubs that implement both ToolInterface and CanDescribeTool. They can be registered in a Tools collection and will be resolved by the ToolExecutor when a matching tool call arrives.
### Fixed Return Value
The simplest mock returns the same value regardless of the arguments passed:
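A sketch, assuming a factory along these lines (the returning name is hypothetical):

```php
// Hypothetical factory: always answers '12:00', whatever the arguments.
$tool = FakeTool::returning(name: 'get_time', value: '12:00');
```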
### Custom Logic

For more realistic stubs, pass a callable that receives the tool's arguments and returns a result; the return value is wrapped via Result::from() automatically:
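For instance (the withCallable factory name is an assumption):

```php
$tool = FakeTool::withCallable(
    'add_numbers',
    fn (array $args) => $args['a'] + $args['b'], // wrapped via Result::from()
);
```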
### Custom Schema and Metadata
When you need the fake to advertise a specific JSON Schema (for example, to test schema validation), pass the schema parameter:
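For example (an array-based schema is shown; the package may expect a dedicated schema object instead):

```php
$tool = FakeTool::returning( // hypothetical factory name
    name: 'get_weather',
    value: 'sunny',
    schema: [
        'type' => 'object',
        'properties' => ['city' => ['type' => 'string']],
        'required' => ['city'],
    ],
);
```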
## Full Test Example
The following Pest test demonstrates the complete pattern: create a FakeTool, script a FakeAgentDriver with a tool-call step followed by a final-response step, wire them into an AgentLoop, and assert on the result:
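A sketch of such a test (the constructors, run(), and finalResponse() accessors are illustrative; adapt them to the package's actual API):

```php
it('calls the tool and returns the final answer', function () {
    // Tool stub resolved by the ToolExecutor when the matching call arrives.
    $tool = FakeTool::returning(name: 'get_weather', value: 'sunny');

    // Scripted "LLM": one tool call, then a terminal response.
    $driver = new FakeAgentDriver([
        ScenarioStep::toolCall('get_weather', ['city' => 'Paris']),
        ScenarioStep::final('It is sunny in Paris.'),
    ]);

    $loop = new AgentLoop(driver: $driver, tools: new Tools($tool));

    $state = $loop->run('What is the weather in Paris?');

    expect($state->finalResponse())->toBe('It is sunny in Paris.');
});
```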
## Using iterate() for Step-Level Testing
The AgentLoop::iterate() method returns a generator that yields the AgentState after each completed step. This gives you fine-grained visibility into intermediate states — useful for asserting on tool execution ordering, intermediate messages, or guard behavior:
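A sketch (iterate() comes from this document; the surrounding wiring assumes a two-step script like the one above):

```php
$states = [];
foreach ($loop->iterate('What is the weather in Paris?') as $state) {
    $states[] = $state; // one AgentState per completed step
}

// Two scripted steps yield two states; the last one is terminal.
expect($states)->toHaveCount(2);
```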
Tip: The final state yielded by iterate() includes the withExecutionCompleted() transition, so you can also assert on ExecutionStatus and total usage.
## Testing Error Handling
You can verify that your agent handles errors gracefully by scripting error steps:
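For example (whether the failure surfaces as a thrown exception or a failed ExecutionStatus depends on your loop configuration; the assertion below assumes the former):

```php
it('propagates scripted errors', function () {
    $driver = new FakeAgentDriver([
        ScenarioStep::error('Simulated provider outage'),
    ]);

    $loop = new AgentLoop(driver: $driver, tools: new Tools());

    // The attached RuntimeException is treated as a failure by the loop.
    expect(fn () => $loop->run('hello'))->toThrow(RuntimeException::class);
});
```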
## Testing Subagent Scenarios

FakeAgentDriver supports child steps for subagent testing. When the driver is cloned for a subagent (via withLLMProvider() or withLLMConfig()), it uses the child steps instead of the parent steps:
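A sketch (withChildSteps() is named in this document; the parent script and tool-call arguments are illustrative):

```php
$driver = (new FakeAgentDriver([
        // Parent script: delegate to a subagent, then finish.
        ScenarioStep::toolCall('spawn_subagent', ['task' => 'summarize']),
        ScenarioStep::final('parent done'),
    ]))
    ->withChildSteps([
        ScenarioStep::final('ok'), // consumed by the cloned subagent driver
    ]);
```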
In this example, the child script consists of a single ScenarioStep::final('ok'), which the subagent consumes as its terminal response.
## Testing with Events
You can attach event listeners to the AgentLoop to capture events emitted during execution. This is useful for verifying that specific lifecycle events fire at the right time:
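A sketch (the listener-registration method shown here is an assumption; use whatever hook your AgentLoop version exposes):

```php
$captured = [];

$loop = new AgentLoop(driver: $driver, tools: new Tools($tool));
// Hypothetical hook: record the class name of every emitted event.
$loop->onEvent(fn (object $event) => $captured[] = $event::class);

$loop->run('What is the weather in Paris?');

// Assert on the lifecycle events you care about, e.g. tool-execution events.
expect($captured)->not->toBeEmpty();
```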
## Summary
| Component | Purpose |
|---|---|
| FakeAgentDriver | Replaces the LLM driver with a scripted sequence of steps |
| ScenarioStep | Describes a single loop iteration (final, tool, error, or toolCall) |
| FakeTool | Stubs a tool with a fixed return value or custom callable |
| AgentLoop::iterate() | Yields state after each step for fine-grained assertions |
| withChildSteps() | Scripts subagent behavior when using FakeAgentDriver |