Skip to main content
Debugging LLM interactions is essential for troubleshooting and optimizing your applications. Polyglot provides several layers of observability, from high-level event listeners to raw HTTP request inspection.

Wiretapping the Runtime

The simplest debugging path is to attach a wiretap listener to the InferenceRuntime. The wiretap receives every event dispatched during the request lifecycle, including request construction, driver selection, streaming deltas, and the final response.
<?php

use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
use Cognesy\Polyglot\Inference\Inference;
use Cognesy\Polyglot\Inference\InferenceRuntime;

$runtime = InferenceRuntime::fromConfig(new LLMConfig(
    driver: 'openai',
    apiUrl: 'https://api.openai.com/v1',
    apiKey: (string) getenv('OPENAI_API_KEY'),
    endpoint: '/chat/completions',
    model: 'gpt-4.1-nano',
))->wiretap(function ($event): void {
    echo get_class($event) . PHP_EOL;
});

$text = Inference::fromRuntime($runtime)
    ->withMessages(Messages::fromString('Say hello.'))
    ->get();
// @doctest id="3028"
This prints every event class name as it fires, giving you an immediate view of the request flow without modifying your application code.

Listening for Specific Events

When you only care about certain events, use onEvent() to register targeted listeners instead of a wiretap. This avoids noise from events you do not need.
<?php

use Cognesy\Events\Dispatchers\EventDispatcher;
use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
use Cognesy\Polyglot\Inference\Events\InferenceRequested;
use Cognesy\Polyglot\Inference\Events\InferenceResponseCreated;
use Cognesy\Polyglot\Inference\Inference;
use Cognesy\Polyglot\Inference\InferenceRuntime;

$runtime = InferenceRuntime::fromConfig(
    LLMConfig::fromPreset('openai'),
);

$runtime->onEvent(
    InferenceRequested::class,
    function (InferenceRequested $event): void {
        echo "Request sent to model\n";
    },
);

$runtime->onEvent(
    InferenceResponseCreated::class,
    function (InferenceResponseCreated $event): void {
        echo "Response received\n";
    },
);

$text = Inference::fromRuntime($runtime)
    ->withMessages(Messages::fromString('What is the capital of France?'))
    ->get();
// @doctest id="97e6"

Available Events

Polyglot dispatches events at each stage of the inference lifecycle:
EventWhen it fires
InferenceStartedBefore the first attempt begins
InferenceRequestedWhen a request is about to be sent
InferenceAttemptStartedAt the start of each retry attempt
InferenceAttemptSucceededWhen an attempt receives a successful response
InferenceAttemptFailedWhen an attempt fails (before retry)
StreamEventReceivedWhen a raw SSE event arrives during streaming
StreamEventParsedAfter a stream event is parsed into a delta
StreamFirstChunkReceivedWhen the first visible delta arrives (useful for TTFC)
PartialInferenceDeltaCreatedFor each visible streaming delta
InferenceResponseCreatedWhen the final response is assembled
InferenceCompletedAfter the entire inference flow finishes
InferenceFailedWhen all retry attempts are exhausted
InferenceUsageReportedWhen token usage data is available
InferenceDriverBuiltWhen the driver is constructed (includes redacted config)

Logging to Files

For persistent debugging, write event data to a log file:
<?php

use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
use Cognesy\Polyglot\Inference\Events\InferenceRequested;
use Cognesy\Polyglot\Inference\Events\InferenceResponseCreated;
use Cognesy\Polyglot\Inference\Inference;
use Cognesy\Polyglot\Inference\InferenceRuntime;

function logToFile(string $message, string $filename = 'llm_debug.log'): void {
    $timestamp = date('Y-m-d H:i:s');
    file_put_contents(
        $filename,
        "[$timestamp] $message" . PHP_EOL,
        FILE_APPEND,
    );
}

$runtime = InferenceRuntime::fromConfig(LLMConfig::fromPreset('openai'));

$runtime->onEvent(
    InferenceRequested::class,
    function (InferenceRequested $event): void {
        logToFile("REQUEST: " . json_encode($event->data));
    },
);

$runtime->onEvent(
    InferenceResponseCreated::class,
    function (InferenceResponseCreated $event): void {
        logToFile("RESPONSE: " . json_encode($event->data));
    },
);

$text = Inference::fromRuntime($runtime)
    ->withMessages(Messages::fromString('What is artificial intelligence?'))
    ->get();
// @doctest id="d661"

HTTP-Level Inspection

If you need to see the raw HTTP request and response bodies, inject a custom HTTP client with middleware. This is useful when you suspect Polyglot is sending an unexpected payload, or when the provider returns an error body that higher-level events do not surface.
<?php

use Cognesy\Http\Config\HttpClientConfig;
use Cognesy\Http\HttpClient;
use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
use Cognesy\Polyglot\Inference\Inference;
use Cognesy\Polyglot\Inference\InferenceRuntime;

$httpClient = HttpClient::fromConfig(new HttpClientConfig(
    connectTimeout: 10,
    requestTimeout: 60,
));

$runtime = InferenceRuntime::fromConfig(
    config: LLMConfig::fromPreset('openai'),
    httpClient: $httpClient,
);

$text = Inference::fromRuntime($runtime)
    ->withMessages(Messages::fromString('Test message'))
    ->get();
// @doctest id="b07b"
You can add custom middleware to the HTTP client using withMiddleware() to log, transform, or inspect requests and responses at the transport layer. This is especially helpful when working behind proxies, or when provider error messages are only visible in the raw HTTP body.

Tips for Effective Debugging

  • Start with wiretap. It gives a complete picture with no configuration.
  • Narrow to specific events once you know which stage of the flow is failing.
  • Check the InferenceDriverBuilt event to confirm the correct driver and configuration were resolved. The config is automatically redacted to hide API keys.
  • Use file logging in production rather than echo, so you can review logs after the fact.
  • For streaming issues, listen for StreamFirstChunkReceived to measure time-to-first-chunk, and PartialInferenceDeltaCreated to verify deltas are arriving.