Image to data (Gemini)
Overview
This example shows how to extract structured data from an image using Instructor. The image is loaded from a file and converted to base64 before it is sent to the Gemini API.
The response model is a PHP class that represents the structured receipt information: vendor, items, subtotal, tax, tip, and total.
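As a rough illustration of the load-and-encode step described above, here is a minimal sketch in plain PHP. The helper name `imageToDataUri` and its `$mimeType` parameter are hypothetical and not part of Instructor; the library's `Image` class may handle this differently internally.

```php
<?php
// Hypothetical helper (not part of Instructor): reads a file and returns
// its contents as a base64-encoded data URI.
function imageToDataUri(string $path, string $mimeType = 'image/png'): string {
    if (!is_readable($path)) {
        throw new RuntimeException("Cannot read image file: {$path}");
    }
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Failed to load image file: {$path}");
    }
    // base64_encode() turns the raw bytes into ASCII text that is safe
    // to embed in a JSON request body.
    return 'data:' . $mimeType . ';base64,' . base64_encode($bytes);
}
```

Calling `imageToDataUri(__DIR__ . '/receipt.png')` would return a string starting with `data:image/png;base64,` followed by the encoded file contents.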
Scanned image
Here’s the image we’re going to extract data from.
Example
<?php
// Load Composer's autoloader and register the library sources for this example.
$loader = require 'vendor/autoload.php';
$loader->add('Cognesy\\Instructor\\', __DIR__ . '/../../src/');

use Cognesy\Instructor\Clients\Gemini\GeminiClient;
use Cognesy\Instructor\Enums\Mode;
use Cognesy\Instructor\Extras\Image\Image;
use Cognesy\Instructor\Instructor;
use Cognesy\Instructor\Utils\Env;

// Vendor details printed on the receipt; every field may be missing.
class Vendor {
    public ?string $name = '';
    public ?string $address = '';
    public ?string $phone = '';
}

// A single line item on the receipt.
class ReceiptItem {
    public string $name;
    public ?int $quantity = 1;
    public float $price;
}

// Response model: the structure Instructor fills in from the image.
class Receipt {
    public Vendor $vendor;
    /** @var ReceiptItem[] */
    public array $items = [];
    public ?float $subtotal;
    public ?float $tax;
    public ?float $tip;
    public float $total;
}

// The API key is read from the GEMINI_API_KEY environment variable.
$client = new GeminiClient(
    apiKey: Env::get('GEMINI_API_KEY'),
);

// Send the image together with the prompt and deserialize the reply into Receipt.
$receipt = (new Instructor)->withClient($client)->respond(
    input: Image::fromFile(__DIR__ . '/receipt.png'),
    responseModel: Receipt::class,
    prompt: 'Extract structured data from the receipt. Return result as JSON following this schema: <|json_schema|>',
    mode: Mode::Json,
    options: ['max_tokens' => 4096],
);

dump($receipt);

assert($receipt->total === 169.82);
?>
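The final assertion in the example compares floats with `===`, which holds only when the extracted total is exactly the same value as `169.82`. If the extracted amounts are later summed or otherwise computed, a tolerance-based comparison is a more forgiving alternative. The sketch below assumes a hypothetical helper `amountsMatch`, which is not part of Instructor, and an arbitrary tolerance of half a cent.

```php
<?php
// Hypothetical helper (not part of Instructor): treats two monetary
// amounts as equal when they differ by less than $epsilon.
function amountsMatch(float $a, float $b, float $epsilon = 0.005): bool {
    return abs($a - $b) < $epsilon;
}
```

With this helper, the check could be written as `assert(amountsMatch($receipt->total, 169.82));`.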