Inference

Overview

The Inference class provides access to LLM APIs with convenient methods for executing model inference, including chat completions, tool calling, and JSON output generation.

LLM provider access details can be found and modified in /config/llm.php.
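
For illustration only, a connection entry in /config/llm.php might look like the sketch below; the key names (providerType, apiUrl, apiKey, defaultModel) are assumptions for this example, so check the config file shipped with the library for the authoritative structure.

<?php
// Hypothetical sketch of /config/llm.php; key names are illustrative.
return [
    'defaultConnection' => 'openai',
    'connections' => [
        'openai' => [
            'providerType' => 'openai',                        // which API driver to use
            'apiUrl'       => 'https://api.openai.com/v1',     // provider base URL
            'apiKey'       => getenv('OPENAI_API_KEY') ?: '',  // credentials from environment
            'defaultModel' => 'gpt-4o-mini',                   // model used when none is given
        ],
    ],
];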

Examples

Simple text inference

The simplified inference API uses the default connection for convenient ad-hoc calls. The default LLM connection can be configured in config/llm.php.

<?php
use Cognesy\Polyglot\LLM\Inference;

$answer = Inference::text('What is the capital of Germany?');

echo "USER: What is the capital of Germany?\n";
echo "ASSISTANT: $answer\n";

Regular synchronous inference

The regular inference API lets you customize inference options specific to a given LLM provider. Most providers' options are compatible with the OpenAI API.

<?php
use Cognesy\Polyglot\LLM\Inference;

$answer = (new Inference)
    ->withConnection('openai') // optional, default is set in /config/llm.php
    ->create(
        messages: [['role' => 'user', 'content' => 'What is the capital of France?']],
        options: ['max_tokens' => 64]
    )
    ->toText();

echo "USER: What is capital of France\n";
echo "ASSISTANT: $answer\n";

Streaming inference results

The inference API supports streaming responses, which is useful for building more responsive UX: you can display partial responses from the LLM as soon as they arrive, instead of waiting for the whole response to complete.

<?php
use Cognesy\Polyglot\LLM\Inference;

$stream = (new Inference)
    ->create(
        messages: [['role' => 'user', 'content' => 'Describe the capital of Brazil']],
        options: ['max_tokens' => 128, 'stream' => true] // streaming must be enabled in options
    )
    ->stream()
    ->responses();

echo "USER: Describe capital of Brasil\n";
echo "ASSISTANT: ";
foreach ($stream as $partial) {
    echo $partial->contentDelta;
}
echo "\n";