Polyglot uses an event system to provide observability into the internal execution pipeline. Events are dispatched at each stage of the lifecycle, making it straightforward to implement logging, metrics, debugging, and monitoring without modifying the core library.

Listening to Events

Both inference and embeddings runtimes expose two ways to listen to events:

Targeted Listeners

Use onEvent() to listen for a specific event class:
use Cognesy\Polyglot\Inference\Config\LLMConfig;
use Cognesy\Polyglot\Inference\Events\InferenceResponseCreated;
use Cognesy\Polyglot\Inference\InferenceRuntime;

$runtime = InferenceRuntime::fromConfig(
    new LLMConfig(
        driver: 'openai',
        apiUrl: 'https://api.openai.com/v1',
        apiKey: getenv('OPENAI_API_KEY'),
        endpoint: '/chat/completions',
        model: 'gpt-4.1-nano',
    ),
)->onEvent(InferenceResponseCreated::class, function ($event): void {
    // Log or inspect the response
});
You can register multiple listeners for the same event class. An optional priority parameter controls the order (higher values run first):
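use Cognesy\Polyglot\Inference\Events\InferenceStarted;

// Listeners registered with a higher priority value run first.
$runtime->onEvent(InferenceStarted::class, $highPriorityListener, priority: 10);
$runtime->onEvent(InferenceStarted::class, $lowPriorityListener, priority: 0);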
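To make the ordering rule concrete, here is a minimal, self-contained sketch of priority-based dispatch. It is not Polyglot's dispatcher: the PriorityListeners class and its add() and dispatch() methods are hypothetical names used only to show how "higher values run first" behaves. It also assumes that listeners sharing a priority run in registration order, which the text above does not state.

```php
<?php
// Hypothetical stand-in for illustration only; not part of Polyglot.
final class PriorityListeners
{
    /** @var array<int, list<callable>> listeners grouped by priority */
    private array $byPriority = [];

    public function add(callable $listener, int $priority = 0): void
    {
        $this->byPriority[$priority][] = $listener;
    }

    public function dispatch(object $event): void
    {
        // Order the priority groups from highest to lowest before calling.
        $groups = $this->byPriority;
        krsort($groups, SORT_NUMERIC);
        foreach ($groups as $listeners) {
            foreach ($listeners as $listener) {
                $listener($event);
            }
        }
    }
}

$calls = [];
$listeners = new PriorityListeners();
// Registered first, but with the lower priority.
$listeners->add(function (object $event) use (&$calls): void { $calls[] = 'low'; }, priority: 0);
$listeners->add(function (object $event) use (&$calls): void { $calls[] = 'high'; }, priority: 10);
$listeners->dispatch(new stdClass());

echo implode(', ', $calls) . "\n"; // prints "high, low"
```

Registration order does not matter across priorities: the listener added second still runs first because its priority value is higher.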

Wiretap

Use wiretap() to receive all events regardless of type. This is useful for debugging and general-purpose logging:
$runtime->wiretap(function ($event): void {
    echo get_class($event) . "\n";
});
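A wiretap listener is also a convenient place to build a quick histogram of which events fire during a request. The sketch below assumes wiretap() accepts any PHP callable, so an invokable object should work as well as a closure. EventCounter is a hypothetical helper, not a Polyglot class, and it is fed stand-in objects here so that it runs without a configured runtime.

```php
<?php
// Hypothetical helper for illustration; not part of Polyglot.
final class EventCounter
{
    /** @var array<string, int> event class name => times seen */
    private array $counts = [];

    // Invokable, so an instance can be passed wherever a callable is accepted.
    public function __invoke(object $event): void
    {
        $class = get_class($event);
        $this->counts[$class] = ($this->counts[$class] ?? 0) + 1;
    }

    /** @return array<string, int> counts, most frequent first */
    public function counts(): array
    {
        $sorted = $this->counts;
        arsort($sorted);
        return $sorted;
    }
}

$counter = new EventCounter();
// With a real runtime this would be: $runtime->wiretap($counter);
// Stand-in objects are used here so the sketch runs on its own.
$counter(new stdClass());
$counter(new ArrayObject());
$counter(new stdClass());

print_r($counter->counts());
```

Keeping the counter as an object rather than a closure lets the same instance be inspected after the request finishes.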

Inference Events

The inference lifecycle dispatches events at several levels, described in the sections below:

Execution-Level Events

| Event | When Dispatched | Key Data |
| --- | --- | --- |
| InferenceStarted | Beginning of execution | data['executionId'], data['requestId'], data['isStreamed'], data['model'], data['messageCount'] |
| InferenceCompleted | End of execution (success or failure) | data['executionId'], data['isSuccess'], data['finishReason'], data['attemptCount'], data['durationMs'], token-count fields |

These events bracket the entire inference operation, including any retry attempts. InferenceCompleted is dispatched exactly once per execution, whether it succeeded or failed.

Attempt-Level Events

Each retry attempt dispatches its own events:

| Event | When Dispatched | Key Data |
| --- | --- | --- |
| InferenceAttemptStarted | Beginning of an attempt | execution ID, attempt ID, attempt number, model |
| InferenceAttemptSucceeded | Attempt completed successfully | data['executionId'], data['attemptId'], data['attemptNumber'], data['finishReason'], data['durationMs'], token-count fields |
| InferenceAttemptFailed | Attempt failed | data['executionId'], data['attemptId'], data['attemptNumber'], data['errorMessage'], data['errorType'], data['willRetry'], data['httpStatusCode'], partial token-count fields, data['durationMs'] |
| InferenceUsageReported | After a successful attempt | data['executionId'], data['model'], data['isFinal'], token-count fields |

When retries are configured, you may see multiple InferenceAttemptStarted/InferenceAttemptFailed pairs before a final InferenceAttemptSucceeded event. The attemptNumber field tracks which attempt is running.
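As a rough illustration of how these payloads can be correlated, the sketch below groups InferenceAttemptFailed data by executionId. summarizeFailedAttempts() is a hypothetical helper, the sample payloads are made up, and the sketch assumes that willRetry being false means no further attempt follows that failure.

```php
<?php
/**
 * Hypothetical helper: groups InferenceAttemptFailed payloads by execution.
 *
 * @param list<array<string, mixed>> $payloads copies of $event->data
 * @return array<string, array{failures: int, gaveUp: bool}>
 */
function summarizeFailedAttempts(array $payloads): array
{
    $summary = [];
    foreach ($payloads as $data) {
        $id = (string) ($data['executionId'] ?? 'unknown');
        $summary[$id] ??= ['failures' => 0, 'gaveUp' => false];
        $summary[$id]['failures']++;
        // Assumption: willRetry === false means this was the last attempt.
        if (($data['willRetry'] ?? false) === false) {
            $summary[$id]['gaveUp'] = true;
        }
    }
    return $summary;
}

// Made-up sample payloads using the field names listed above.
$summary = summarizeFailedAttempts([
    ['executionId' => 'exec-1', 'attemptNumber' => 1, 'willRetry' => true],
    ['executionId' => 'exec-1', 'attemptNumber' => 2, 'willRetry' => false],
    ['executionId' => 'exec-2', 'attemptNumber' => 1, 'willRetry' => true],
]);

print_r($summary);
```

In a real listener, each payload would be collected from $event->data inside an onEvent() callback rather than passed in as a literal array.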

Response Events

| Event | When Dispatched | Key Data |
| --- | --- | --- |
| InferenceRequested | Before sending the HTTP request | request data |
| InferenceResponseCreated | After receiving and parsing the response | data['executionId'], data['requestId'], data['responseId'], data['finishReason'], content-length fields, tool-call summary, data['usage'] |
| InferenceFailed | On unrecoverable failure | error details |

Streaming Events

| Event | When Dispatched | Key Data |
| --- | --- | --- |
| StreamFirstChunkReceived | First visible delta arrives | execution ID, timeToFirstChunkMs, receivedAt, model, initial content |
| PartialInferenceDeltaCreated | Each visible delta | data['executionId'], data['contentDelta'] |
| StreamEventReceived | Raw SSE event received | raw event data |
| StreamEventParsed | SSE event parsed into a delta | parsed event data |

The StreamFirstChunkReceived event is particularly useful for measuring time-to-first-chunk (TTFC): it exposes timeToFirstChunkMs directly and also includes the requestStartedAt timestamp.
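The per-delta events can also be used to rebuild the streamed text as it arrives. The following is a minimal sketch, assuming PartialInferenceDeltaCreated exposes its payload as a public data array with a contentDelta key, as listed in the table. DeltaBuffer is a hypothetical helper and the payloads are made up.

```php
<?php
// Hypothetical helper for illustration; not part of Polyglot.
final class DeltaBuffer
{
    private string $text = '';
    private int $chunks = 0;

    /** @param array<string, mixed> $data copy of $event->data */
    public function append(array $data): void
    {
        $delta = (string) ($data['contentDelta'] ?? '');
        if ($delta === '') {
            return; // ignore payloads without visible content
        }
        $this->text .= $delta;
        $this->chunks++;
    }

    public function text(): string { return $this->text; }
    public function chunks(): int { return $this->chunks; }
}

$buffer = new DeltaBuffer();
// With a real runtime, under the assumption stated above:
// $runtime->onEvent(PartialInferenceDeltaCreated::class, fn ($e) => $buffer->append($e->data));
foreach ([['contentDelta' => 'Hel'], ['contentDelta' => 'lo'], []] as $data) {
    $buffer->append($data);
}

echo $buffer->text() . ' (' . $buffer->chunks() . " chunks)\n"; // prints "Hello (2 chunks)"
```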

Driver Events

| Event | When Dispatched | Key Data |
| --- | --- | --- |
| InferenceDriverBuilt | After the driver is created by the factory | driver class, redacted config, HTTP client class |

Sensitive configuration values (API keys, tokens, secrets) are automatically redacted in the InferenceDriverBuilt event payload.

Embeddings Events

The embeddings lifecycle dispatches a smaller set of events:

| Event | When Dispatched | Key Data |
| --- | --- | --- |
| EmbeddingsDriverBuilt | After the embeddings driver is created | driver class, config, HTTP client class |
| EmbeddingsRequested | Before sending the embeddings request | request data |
| EmbeddingsResponseReceived | After receiving the response | data['model'], data['inputCount'], data['vectorCount'], data['dimensions'], data['usage'] |
| EmbeddingsFailed | On failure | error details |

Practical Examples

Logging Token Usage

use Cognesy\Polyglot\Inference\Events\InferenceUsageReported;

$runtime->onEvent(InferenceUsageReported::class, function ($event): void {
    logger()->info('Token usage', [
        'model' => $event->data['model'] ?? null,
        'inputTokens' => $event->data['inputTokens'] ?? 0,
        'outputTokens' => $event->data['outputTokens'] ?? 0,
        'totalTokens' => $event->data['totalTokens'] ?? 0,
    ]);
});

Measuring Time-to-First-Chunk

use Cognesy\Polyglot\Inference\Events\StreamFirstChunkReceived;

$runtime->onEvent(StreamFirstChunkReceived::class, function (StreamFirstChunkReceived $event): void {
    logger()->info("TTFC: {$event->timeToFirstChunkMs}ms for model {$event->model}");
});

Tracking Retry Attempts

use Cognesy\Polyglot\Inference\Events\InferenceAttemptFailed;

$runtime->onEvent(InferenceAttemptFailed::class, function (InferenceAttemptFailed $event): void {
    logger()->warning('Attempt failed', [
        'attemptNumber' => $event->data['attemptNumber'] ?? null,
        'errorMessage' => $event->data['errorMessage'] ?? null,
        'errorType' => $event->data['errorType'] ?? null,
        'willRetry' => $event->data['willRetry'] ?? false,
        'httpStatus' => $event->data['httpStatusCode'] ?? null,
    ]);
});

Monitoring Execution Outcomes

use Cognesy\Polyglot\Inference\Events\InferenceCompleted;

$runtime->onEvent(InferenceCompleted::class, function (InferenceCompleted $event): void {
    logger()->info('Inference completed', [
        'success' => $event->data['isSuccess'] ?? false,
        'finishReason' => $event->data['finishReason'] ?? null,
        'attempts' => $event->data['attemptCount'] ?? 0,
        'totalTokens' => $event->data['totalTokens'] ?? 0,
        'durationMs' => $event->data['durationMs'] ?? 0,
    ]);
});
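Beyond logging each completion, the same payload fields can be rolled up into simple metrics. The sketch below is one possible aggregation, not a Polyglot feature: aggregateOutcomes() is a hypothetical helper that takes copies of the InferenceCompleted data arrays, and the sample values are invented.

```php
<?php
/**
 * Hypothetical helper: aggregates InferenceCompleted payloads.
 *
 * @param list<array<string, mixed>> $payloads copies of $event->data
 * @return array{total: int, succeeded: int, successRate: float, avgDurationMs: float}
 */
function aggregateOutcomes(array $payloads): array
{
    $total = count($payloads);
    $succeeded = 0;
    $durationSum = 0.0;
    foreach ($payloads as $data) {
        if (($data['isSuccess'] ?? false) === true) {
            $succeeded++;
        }
        $durationSum += (float) ($data['durationMs'] ?? 0);
    }
    return [
        'total' => $total,
        'succeeded' => $succeeded,
        // Guard against division by zero when nothing has completed yet.
        'successRate' => $total > 0 ? (float) ($succeeded / $total) : 0.0,
        'avgDurationMs' => $total > 0 ? $durationSum / $total : 0.0,
    ];
}

// Made-up sample payloads using the field names from the example above.
$stats = aggregateOutcomes([
    ['isSuccess' => true, 'durationMs' => 100],
    ['isSuccess' => true, 'durationMs' => 300],
    ['isSuccess' => false, 'durationMs' => 200],
    ['isSuccess' => true, 'durationMs' => 200],
]);

printf("%d/%d succeeded, avg %.1f ms\n", $stats['succeeded'], $stats['total'], $stats['avgDurationMs']);
// prints "3/4 succeeded, avg 200.0 ms"
```

Because InferenceCompleted is dispatched exactly once per execution, counting these payloads counts executions rather than individual attempts.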

Event Dispatcher

Events are dispatched through an EventDispatcher that implements CanHandleEvents (which extends Psr\EventDispatcher\EventDispatcherInterface). When a runtime is created without an explicit event dispatcher, it creates a default one named 'polyglot.inference.runtime' or 'polyglot.embeddings.runtime'. You can inject a shared event dispatcher to correlate events across multiple runtimes or integrate with your application’s existing event system:
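use Cognesy\Events\Dispatchers\EventDispatcher;
use Cognesy\Polyglot\Inference\InferenceRuntime;

// $config is an LLMConfig instance, built as in the first example.
$events = new EventDispatcher(name: 'my-app');
$runtime = InferenceRuntime::fromConfig($config, events: $events);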
The same event dispatcher instance can be shared between inference and embeddings runtimes, allowing a single wiretap listener to observe all Polyglot activity.