Web page to PHP objects
Overview
This example demonstrates how to extract structured data from a web page and receive it as a PHP object.
Example
In this example we extract a list of Laravel companies from The Manifest website. The result is a list of Company objects.
We use the Webpage extractor to get the content of the page and specify the 'none' scraper, which means that the built-in file_get_contents function is used to fetch the content of the page.
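To make the 'none' option more concrete, here is a minimal sketch of fetching a page and selecting repeated elements using only PHP built-ins (DOMDocument and DOMXPath). It is not how the Webpage class is implemented. The helper name selectCards, the sample HTML, and the h3/p markup inside each card are assumptions made for illustration, and the page is inlined so the sketch runs without network access.

```php
<?php
// With a live URL this string would come from file_get_contents($url).
// The markup below is made-up sample data, not the real page structure.
$html = <<<HTML
<div class="directory-providers__list">
  <div class="provider-card"><h3>Acme Dev</h3><p>Warsaw, Poland</p></div>
  <div class="provider-card"><h3>Beta Labs</h3><p>Krakow, Poland</p></div>
  <div class="provider-card"><h3>Gamma Soft</h3><p>Gdansk, Poland</p></div>
  <div class="provider-card"><h3>Delta Code</h3><p>Lodz, Poland</p></div>
</div>
HTML;

/**
 * Hypothetical helper: returns "name | location" for up to $limit
 * elements that carry the given CSS class.
 */
function selectCards(string $html, string $class, int $limit): array {
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Common XPath idiom for matching one class among several.
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $cards = [];
    foreach ($xpath->query($query) as $node) {
        if (count($cards) >= $limit) {
            break;
        }
        // string(...) yields the text of the first match, or '' if none.
        $name = trim($xpath->evaluate('string(.//h3)', $node));
        $location = trim($xpath->evaluate('string(.//p)', $node));
        $cards[] = $name . ' | ' . $location;
    }
    return $cards;
}

$cards = selectCards($html, 'provider-card', 3);
print_r($cards);
```

Plain file_get_contents only sees the HTML the server returns, so pages that render their content with JavaScript or block simple clients generally need one of the dedicated scrapers.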
In a production environment you might want to use one of the supported scrapers:
- browsershot
- scrapingbee
- scrapfly
- jinareader
Commercial scrapers require an API key, which can be set in the configuration file (/config/web.php).
<?php
require 'examples/boot.php';

use Cognesy\Auxiliary\Web\Webpage;
use Cognesy\Instructor\Features\Schema\Attributes\Instructions;
use Cognesy\Instructor\Instructor;
use Cognesy\Polyglot\LLM\Enums\OutputMode;

// Response model: every company card on the page is mapped onto this class
class Company {
    public string $name = '';
    public string $location = '';
    public string $description = '';
    public int $minProjectBudget = 0;
    public string $companySize = '';
    #[Instructions('Remove any tracking parameters from the URL')]
    public string $websiteUrl = '';
    /** @var string[] */
    public array $clients = [];
}

$instructor = (new Instructor)->withConnection('openai');

// 'none' fetches the page with file_get_contents();
// swap in e.g. 'scrapfly' once its API key is configured
$companyGen = Webpage::withScraper('none')
    ->get('https://themanifest.com/pl/software-development/laravel/companies?page=1')
    ->cleanup()
    ->select('.directory-providers__list')
    ->selectMany(
        selector: '.provider-card',
        callback: fn($item) => $item->asMarkdown(),
        limit: 3
    );

// Extract one Company object per card (each card arrives as Markdown)
$companies = [];
foreach ($companyGen as $companyDiv) {
    $company = $instructor->respond(
        messages: $companyDiv,
        responseModel: Company::class,
        mode: OutputMode::Json
    );
    $companies[] = $company;
    dump($company);
}

assert(count($companies) === 3);
?>
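The Instructions attribute on $websiteUrl asks the model to remove tracking parameters, which is a request rather than a guarantee. One option is to apply a deterministic clean-up to the extracted value as well. The sketch below uses only PHP built-ins. The helper name stripTrackingParams and the list of parameters treated as tracking (utm_*, gclid, fbclid) are assumptions for illustration and are not part of Instructor.

```php
<?php
/**
 * Hypothetical helper: removes common tracking parameters from a URL.
 * The set of parameter names is an assumption; adjust it to your needs.
 */
function stripTrackingParams(string $url): string {
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['query'])) {
        return $url; // malformed URL or no query string: leave untouched
    }

    parse_str($parts['query'], $params);
    $kept = array_filter(
        $params,
        fn($key) => !str_starts_with((string) $key, 'utm_')
            && !in_array((string) $key, ['gclid', 'fbclid'], true),
        ARRAY_FILTER_USE_KEY
    );

    // Everything before the first '?' (scheme, host, path) stays as-is.
    $base = strstr($url, '?', true);
    $query = $kept ? '?' . http_build_query($kept) : '';
    $fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';

    return $base . $query . $fragment;
}

echo stripTrackingParams('https://example.com/?utm_source=manifest&ref=abc&utm_medium=referral'), PHP_EOL;
echo stripTrackingParams('https://example.com/about?utm_campaign=x#team'), PHP_EOL;
```

In the loop above this could be applied as $company->websiteUrl = stripTrackingParams($company->websiteUrl); before the object is stored. Note that http_build_query re-encodes the remaining parameters, so their encoding may differ slightly from the input.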