Documentation Index
Fetch the complete documentation index at: https://docs.instructorphp.com/llms.txt
Use this file to discover all available pages before exploring further.
The Inference class is a thin, immutable facade over InferenceRuntime. It provides the
unified entry point for configuring providers, building requests, and retrieving responses
from any supported LLM.
Creating an Instance
Choose the factory method that matches your level of control:
<?php
use Cognesy\Polyglot\Inference\Inference;
// Use a named preset from your configuration
$inference = Inference::using('openai');
// Use the default provider
$inference = new Inference();
// Explicit configuration object
$inference = Inference::fromConfig($config);
// From a provider instance
$inference = Inference::fromProvider($provider);
// From a fully assembled runtime
$inference = Inference::fromRuntime($runtime);
// @doctest id="08b1"
Presets
The most common pattern is Inference::using(), which loads a named preset from your
configuration files. Each preset defines the provider type, API key, base URL, default
model, and other connection details:
// Use different providers by switching the preset name
$openai = Inference::using('openai');
$anthropic = Inference::using('anthropic');
$ollama = Inference::using('ollama');
// @doctest id="dabd"
Configuring a Request
The fluent API lets you build requests step by step. Every method returns a new immutable
instance, so you can safely branch from a shared configuration:
Messages and Model
$inference = Inference::using('openai')
->withMessages(Messages::fromString('Explain dependency injection in one paragraph.'))
->withModel('gpt-4.1-nano');
// @doctest id="2fd8"
use Cognesy\Polyglot\Inference\Data\ToolChoice;
use Cognesy\Polyglot\Inference\Data\ResponseFormat;
$inference = Inference::using('openai')
->withTools($toolDefinitions)
->withToolChoice(ToolChoice::auto())
->withResponseFormat(ResponseFormat::jsonObject());
// @doctest id="392b"
Streaming and Token Limits
$inference = Inference::using('openai')
->withStreaming(true)
->withMaxTokens(256);
// @doctest id="581e"
Provider-Specific Options
$inference = Inference::using('openai')
->withOptions(['temperature' => 0.5, 'top_p' => 0.9]);
// @doctest id="f332"
The Combined with() Method
When you prefer a single call, use with() to set multiple fields at once:
use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Data\ToolChoice;
use Cognesy\Polyglot\Inference\Data\ResponseFormat;
$inference = Inference::using('openai')->with(
messages: Messages::fromString('Hello'),
model: 'gpt-4.1-nano',
toolChoice: ToolChoice::auto(),
responseFormat: ResponseFormat::text(),
options: ['temperature' => 0.7],
);
// @doctest id="7f94"
Full Method Reference
| Method | Purpose |
|---|
withMessages(...) | Set conversation messages |
withModel(...) | Override the model |
withTools(...) | Attach tool/function definitions |
withToolChoice(...) | Control tool selection strategy |
withResponseFormat(...) | Specify the response format |
withOptions(...) | Set provider-specific options |
withStreaming(...) | Enable or disable streaming |
withMaxTokens(...) | Set maximum token count |
withCachedContext(...) | Attach reusable cached context |
withRetryPolicy(...) | Configure retry behavior |
withResponseCachePolicy(...) | Configure response caching |
withRequest(...) | Load all fields from an InferenceRequest |
withRuntime(...) | Replace the underlying runtime |
Executing Requests
Response Shortcuts
These methods build the request, execute it, and return the result in a single step:
<?php
use Cognesy\Polyglot\Inference\Inference;
use Cognesy\Messages\Messages;
$inference = Inference::using('openai')
->withMessages(Messages::fromString('What is PHP?'))
->withModel('gpt-4.1-nano');
// Plain text content
$text = $inference->get();
// Full InferenceResponse object (with usage, finish reason, etc.)
$response = $inference->response();
// JSON string extracted from the response
$json = $inference->asJson();
// Parsed JSON as an associative array
$data = $inference->asJsonData();
// JSON from a tool call response
$toolJson = $inference->asToolCallJson();
// Parsed tool call JSON as an array
$toolData = $inference->asToolCallJsonData();
// @doctest id="f51f"
Streaming
To receive partial results as they arrive from the provider:
$stream = Inference::using('openai')
->withMessages(Messages::fromString('Write a short story about a robot.'))
->stream();
foreach ($stream->deltas() as $partial) {
echo $partial->contentDelta;
}
// @doctest id="434e"
The Lazy Handle: PendingInference
If you need to defer execution or pass the handle to another part of your system,
call create() to get a PendingInference instance. Execution happens only when
you call a response method on it:
$pending = Inference::using('openai')
->withMessages(Messages::fromString('Hello'))
->create();
// Nothing has been sent to the provider yet.
// Execution happens here:
$text = $pending->get();
// @doctest id="f68a"
PendingInference exposes the same response methods as Inference: get(),
response(), asJson(), asJsonData(), asToolCallJson(), asToolCallJsonData(),
and stream().
Custom Drivers
To use a custom driver, implement the CanProvideInferenceDrivers contract and pass
it to Inference::using() or Inference::fromConfig() via the drivers parameter:
use Cognesy\Polyglot\Inference\Inference;
$response = Inference::using('custom-provider', drivers: $myDriverRegistry)
->withMessages(Messages::fromString('Hello from a custom driver.'))
->get();
// @doctest id="54c7"