Instructor Package Cheatsheet

Code-verified quick reference for packages/instructor.

Core Flow

use Cognesy\Instructor\StructuredOutput;

class User {
    public string $name;
    public int $age;
}

$user = (new StructuredOutput)
    ->with(
        messages: 'Jason is 28 years old.',
        responseModel: User::class,
    )
    ->get();

Create `StructuredOutput`

use Cognesy\Instructor\StructuredOutput;
use Cognesy\Polyglot\Inference\Config\LLMConfig;

$so = new StructuredOutput();
$so = StructuredOutput::using('openai');
$so = StructuredOutput::using('openai', '/custom/config/path');
$so = StructuredOutput::fromConfig(LLMConfig::fromArray(['driver' => 'openai']));

Request Configuration (`StructuredOutput`)

$so = (new StructuredOutput)
    ->withMessages($messages)
    ->withInput($input)
    ->withRequest($request)
    ->withResponseModel(User::class)
    ->withResponseClass(User::class)
    ->withResponseObject(new User())
    ->withResponseJsonSchema($jsonSchema)
    ->withSystem('You are a precise extractor')    // string|\Stringable
    ->withPrompt('Extract user profile')            // string|\Stringable
    ->withExamples($examples)
    ->withModel('gpt-4o-mini')
    ->withOptions(['temperature' => 0])
    ->withOption('max_tokens', 1200)
    ->withStreaming(true);

Single-call variant:

$so = (new StructuredOutput)->with(
    messages: $messages,
    responseModel: User::class,
    system: '...',                         // string|\Stringable|null
    prompt: '...',                         // string|\Stringable|null
    examples: $examples,
    model: 'gpt-4o-mini',
    options: ['temperature' => 0],
);

Runtime / Provider Setup

use Cognesy\Instructor\Config\StructuredOutputConfig;
use Cognesy\Instructor\Core\RequestMaterializer;
use Cognesy\Instructor\Core\StructuredPromptRequestMaterializer;
use Cognesy\Instructor\Enums\OutputMode;
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
use Cognesy\Polyglot\Inference\LLMProvider;

$so = StructuredOutput::fromConfig(LLMConfig::fromArray(['driver' => 'openai']));

$runtime = StructuredOutputRuntime::fromConfig(LLMConfig::fromDsn('driver=openai,model=gpt-4o-mini'))
    ->withOutputMode(OutputMode::Json)
    ->withMaxRetries(2);

// Create from LLMProvider
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new());

// Configure with StructuredOutputConfig
$runtime = $runtime->withConfig(StructuredOutputConfig::fromArray([...]));
$runtime = $runtime->withConfig(StructuredOutputConfig::fromDsn('outputMode=json,maxRetries=2'));
$runtime = $runtime->withRequestMaterializer($customMaterializer);
$runtime = $runtime->withRequestMaterializer(new RequestMaterializer()); // legacy/default path
$runtime = $runtime->withRequestMaterializer(new StructuredPromptRequestMaterializer()); // new structured prompt path

$so = (new StructuredOutput)->withRuntime($runtime);

RequestMaterializer remains the legacy/default implementation during rollout. StructuredPromptRequestMaterializer is the new path: it renders a single system prompt, keeps examples inside that system prompt, and, when used with the current runtime, sends cached prompt content through InferenceRequest::cachedContext().

Prompt Config References

use Cognesy\Instructor\Config\StructuredOutputConfig;
use Cognesy\Instructor\Enums\OutputMode;

$config = new StructuredOutputConfig(
    modePromptClasses: [
        OutputMode::Json->value => App\Prompts\JsonSystemPrompt::class,
        OutputMode::Tools->value => App\Prompts\ToolsSystemPrompt::class,
    ],
    retryPromptClass: App\Prompts\RetryFeedbackPrompt::class,
    deserializationErrorPromptClass: App\Prompts\DeserializationRepairPrompt::class,
);

The YAML-safe shape uses fully qualified class name (FQN) strings, keyed by the OutputMode string values:

modePromptClasses:
  json: 'App\Prompts\JsonSystemPrompt'
  tool_call: 'App\Prompts\ToolsSystemPrompt'

retryPromptClass: 'App\Prompts\RetryFeedbackPrompt'
deserializationErrorPromptClass: 'App\Prompts\DeserializationRepairPrompt'
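
The YAML keys line up with the backing string values of the enum cases that the PHP form reaches through `->value`. A minimal sketch of that correspondence follows. It assumes a local stand-in enum (`OutputModeSketch`) and a hypothetical helper (`modeForKey`) rather than the package's own classes, and it models only the two modes used in the example.

```php
<?php
declare(strict_types=1);

// Hypothetical local stand-in for Cognesy\Instructor\Enums\OutputMode,
// reduced to the two cases used in the config example.
enum OutputModeSketch: string
{
    case Tools = 'tool_call';
    case Json = 'json';
}

// Prompt classes as they would arrive from YAML: plain string keys, FQN string values.
$fromYaml = [
    'json' => 'App\\Prompts\\JsonSystemPrompt',
    'tool_call' => 'App\\Prompts\\ToolsSystemPrompt',
];

// The same map built in PHP, keyed through the enum's backing values.
$fromPhp = [
    OutputModeSketch::Json->value => 'App\\Prompts\\JsonSystemPrompt',
    OutputModeSketch::Tools->value => 'App\\Prompts\\ToolsSystemPrompt',
];

// Hypothetical helper: resolve a config key back to an enum case.
// tryFrom() returns null for a key that matches no case.
function modeForKey(string $key): ?OutputModeSketch
{
    return OutputModeSketch::tryFrom($key);
}
```

Keying by `->value` keeps the PHP and YAML forms interchangeable, so a typo in a YAML key shows up as an unresolved mode instead of a silently ignored entry.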

OutputMode Enum

use Cognesy\Instructor\Enums\OutputMode;

OutputMode::Tools;        // 'tool_call' — default, uses tool/function calling
OutputMode::Json;         // 'json' — JSON mode
OutputMode::JsonSchema;   // 'json_schema' — structured outputs / JSON schema mode
OutputMode::MdJson;       // 'md_json' — extract JSON from Markdown code blocks
OutputMode::Text;         // 'text' — plain text extraction
OutputMode::Unrestricted; // 'unrestricted' — no constraints on output format

Pipeline Overrides

use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\Config\LLMConfig;

$runtime = StructuredOutputRuntime::fromConfig(LLMConfig::fromDsn('driver=openai,model=gpt-4o-mini'))
    ->withValidator($validator)
    ->withTransformer($transformer)
    ->withDeserializer($deserializer)
    ->withExtractor($extractor);

$so = (new StructuredOutput)->withRuntime($runtime);

Execution

$result = $so->get();          // parsed value
$response = $so->response();   // StructuredOutputResponse
$raw = $so->inferenceResponse(); // InferenceResponse
$stream = $so->stream();       // StructuredOutputStream

$pending = $so->create();
$result = $pending->get();
$response = $pending->response();
$raw = $pending->inferenceResponse();
$stream = $pending->stream();
$array = $pending->toArray();
$json = $pending->toJson();
$jsonObject = $pending->toJsonObject();
$execution = $pending->execution();

PendingStructuredOutput is a lazy handle:

- no provider call happens until one of the read methods above is used
- get(), response(), inferenceResponse(), and stream() coordinate one execution
- mutable lifecycle bookkeeping sits behind the internal execution session, not on the facade-facing handle
- long-lived streaming state stays in the dedicated stream/state objects
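
The lazy-handle behavior described above can be illustrated with a small standalone sketch. It is not the package's implementation: `LazyHandleSketch`, its `raw()` method, and the counting closure are hypothetical. For brevity it caches the result on the handle itself, whereas the library keeps that bookkeeping in an internal execution session.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a lazy pending handle; illustrative only.
final class LazyHandleSketch
{
    /** @var array{value: mixed, raw: string}|null result of the single execution */
    private ?array $result = null;

    /** @param \Closure(): array{value: mixed, raw: string} $call deferred provider call */
    public function __construct(private \Closure $call) {}

    // Parsed value; triggers the deferred call on first use.
    public function get(): mixed
    {
        return $this->execute()['value'];
    }

    // Raw payload; reuses the same execution as get().
    public function raw(): string
    {
        return $this->execute()['raw'];
    }

    // Run the deferred call at most once and remember its result.
    private function execute(): array
    {
        return $this->result ??= ($this->call)();
    }
}

$calls = 0;
$pending = new LazyHandleSketch(function () use (&$calls): array {
    $calls++; // counts simulated provider calls
    $raw = '{"name":"Jason","age":28}';
    return ['value' => json_decode($raw, true), 'raw' => $raw];
});

$before = $calls;        // constructing the handle has not called anything yet
$value = $pending->get();
$raw = $pending->raw();  // second read shares the first execution
```

Here `$calls` stays at 1 after both reads, which is the property the list describes: several read methods, one coordinated execution.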

Type helpers (available on StructuredOutput and PendingStructuredOutput):

$so->getString();
$so->getInt();
$so->getFloat();
$so->getBoolean();
$so->getObject();
$so->getArray();

Additional type helper (only on PendingStructuredOutput):

$pending->getInstanceOf(User::class);

Streaming (`StructuredOutputStream`)

$stream = $so->withStreaming()->stream();

foreach ($stream->partials() as $partial) {
    // every parsed partial update
}

foreach ($stream->sequence() as $sequenceUpdate) {
    // one update per completed sequence item (Sequence responses only)
}

foreach ($stream->responses() as $responseUpdate) {
    // StructuredOutputResponse, partial or final
}

foreach ($stream->getIterator() as $rawUpdate) {
    // raw emitted StructuredOutputResponse snapshots
}

$latestValue = $stream->lastUpdate();
$latestResponse = $stream->lastResponse();
$usage = $stream->usage();
$finalValue = $stream->finalValue();
$finalResponse = $stream->finalResponse();
$finalRaw = $stream->finalInferenceResponse();

lastResponse() / finalResponse() return StructuredOutputResponse. Use ->inferenceResponse() when you need the nested raw InferenceResponse.

Response Model Helpers

`Sequence`

use Cognesy\Instructor\Extras\Sequence\Sequence;

$people = (new StructuredOutput)
    ->with(
        messages: $text,
        responseModel: Sequence::of(Person::class),
    )
    ->get();

$count = $people->count();
$first = $people->first();
$last = $people->last();
$item = $people->get(0);
$all = $people->all();

`Scalar`

use Cognesy\Instructor\Extras\Scalar\Scalar;

$name = (new StructuredOutput)
    ->with(messages: $text, responseModel: Scalar::string('name'))
    ->get();

$age = (new StructuredOutput)
    ->with(messages: $text, responseModel: Scalar::integer('age'))
    ->get();

$isAdult = (new StructuredOutput)
    ->with(messages: $text, responseModel: Scalar::boolean('isAdult'))
    ->get();

$sentiment = (new StructuredOutput)
    ->with(messages: $text, responseModel: Scalar::enum(Sentiment::class, 'sentiment'))
    ->get();

`Maybe`

use Cognesy\Instructor\Extras\Maybe\Maybe;

$maybeUser = (new StructuredOutput)
    ->with(messages: $text, responseModel: Maybe::is(User::class))
    ->get();

if ($maybeUser->hasValue()) {
    $user = $maybeUser->get();
}

$error = $maybeUser->error();

Output Controls

use Cognesy\Instructor\Extras\Scalar\Scalar;
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Instructor\Enums\OutputMode;
use Cognesy\Polyglot\Inference\Config\LLMConfig;

$runtime = StructuredOutputRuntime::fromConfig(LLMConfig::fromDsn('driver=openai,model=gpt-4o-mini'))
    ->withOutputMode(OutputMode::Json)
    ->withMaxRetries(3)
    ->withDefaultToStdClass(true);

$so = (new StructuredOutput)->withRuntime($runtime);

$asArray = (new StructuredOutput)->intoArray();
$asClass = (new StructuredOutput)->intoInstanceOf(User::class);
$asObject = (new StructuredOutput)->intoObject(Scalar::integer('rating'));

Cached Context

$result = (new StructuredOutput)
    ->withCachedContext(
        messages: $longContext,
        system: 'You know the full context',
    )
    ->with(
        prompt: 'Extract only contact details',
        responseModel: Contact::class,
    )
    ->get();

Examples API

use Cognesy\Instructor\Extras\Example\Example;

$result = (new StructuredOutput)
    ->withExamples([
        Example::fromText(
            'John Doe, john@example.com',
            ['name' => 'John Doe', 'email' => 'john@example.com'],
        ),
    ])
    ->with(messages: $text, responseModel: Contact::class)
    ->get();

Events

use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Instructor\Events\StructuredOutput\StructuredOutputRequestReceived;
use Cognesy\Polyglot\Inference\LLMProvider;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->onEvent(StructuredOutputRequestReceived::class, function (object $event): void {
        // use $event->data['requestId'] / ['executionId'] for correlation
    })
    ->wiretap(function (object $event): void {
        // handle all events
    });

$result = (new StructuredOutput)
    ->withRuntime($runtime)
    ->with(messages: $text, responseModel: User::class)
    ->get();

Testing

Deterministic test seams:

Tests\Support\FakeInferenceDriver
- queue sync InferenceResponse fixtures or streaming PartialInferenceDelta batches
- best for most unit and regression tests inside packages/instructor
Tests\MockHttp
- builds an HTTP client around MockHttpDriver
- use when provider adapter and HTTP response shape still matter
Tests\Integration\Support\ProbeStreamDriver
- observation helper for streaming immediacy and call-count assertions
Tests\Support\ProbeIterator
- explicit iterator helper for controlled delta emission in integration tests
