Adapter Responsibilities
Every inference driver is built from two main translators, each of which may use additional formatters internally.
Request Translation
The request adapter converts a Polyglot InferenceRequest into an HttpRequest. It is responsible for:
- Message formatting: mapping Polyglot's typed Messages (with roles, content parts, tool calls, and tool results) into the provider's expected structure
- Body formatting: assembling the full request body, including model, tools, response format, and mode-specific adjustments
- HTTP request assembly: setting the URL, headers (including authentication), and body
| Class Pattern | Contract | Purpose |
|---|---|---|
| `*MessageFormat` | `CanMapMessages` | Maps Messages to provider format |
| `*BodyFormat` | `CanMapRequestBody` | Assembles the full request body |
| `*RequestAdapter` | `CanTranslateInferenceRequest` | Builds the final HttpRequest |
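The three request-side roles nest inside one another, which a small sketch can make concrete. This is an illustration in Python, not the library's code: the class names only echo the patterns in the table, and every method name, field, URL, and payload shape is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    role: str
    content: str


@dataclass
class HttpRequest:
    url: str
    headers: dict
    body: dict


class SketchMessageFormat:
    """Plays the *MessageFormat role: typed messages -> provider-native list."""

    def map(self, messages: list) -> list:
        return [{"role": m.role, "content": m.content} for m in messages]


class SketchBodyFormat:
    """Plays the *BodyFormat role: assembles the full request body."""

    def __init__(self, message_format: SketchMessageFormat):
        self.message_format = message_format

    def to_body(self, model: str, messages: list, tools: Optional[list] = None) -> dict:
        body = {"model": model, "messages": self.message_format.map(messages)}
        if tools:  # optional fields are added only when present
            body["tools"] = tools
        return body


class SketchRequestAdapter:
    """Plays the *RequestAdapter role: URL + headers + formatted body."""

    def __init__(self, body_format: SketchBodyFormat, base_url: str, api_key: str):
        self.body_format = body_format
        self.base_url = base_url
        self.api_key = api_key

    def to_http_request(self, model: str, messages: list) -> HttpRequest:
        return HttpRequest(
            url=f"{self.base_url}/chat/completions",  # hypothetical endpoint path
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
            body=self.body_format.to_body(model, messages),
        )
```

Because each layer only knows the one beneath it, a provider with a different message structure would swap the innermost class and leave the other two untouched.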
Response Translation
The response adapter converts raw HTTP responses back into Polyglot data objects:
| Class Pattern | Contract | Purpose |
|---|---|---|
| `*ResponseAdapter` | `CanTranslateInferenceResponse` | Parses responses and stream deltas |
| `*UsageFormat` | `CanMapUsage` | Extracts token usage from response data |
How They Compose
Each driver wires its adapters together in its constructor, as the OpenAI driver does.
BaseInferenceRequestDriver handles the shared execution logic: sending HTTP requests, reading responses, and parsing event streams. The adapters only need to handle format translation.
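The split between shared execution and format translation might look roughly like the following. It is a minimal sketch under stated assumptions: the class names, the `make_response_for` method, the callable-based adapters, and the `fake_send` transport are all hypothetical and stand in for whatever the real base driver and HTTP client provide.

```python
import json
from typing import Callable


class SketchBaseDriver:
    """Shared execution: translate, send, translate back."""

    def __init__(self, request_adapter: Callable, response_adapter: Callable,
                 send: Callable[[dict], str]):
        self.request_adapter = request_adapter
        self.response_adapter = response_adapter
        self.send = send  # hypothetical transport: request dict in, raw text out

    def make_response_for(self, prompt: str) -> str:
        http_request = self.request_adapter(prompt)  # request-side translation
        raw = self.send(http_request)                # shared execution logic
        return self.response_adapter(raw)            # response-side translation


class SketchProviderDriver(SketchBaseDriver):
    """A concrete driver only decides which adapters to wire together."""

    def __init__(self, send: Callable[[dict], str]):
        super().__init__(
            request_adapter=lambda prompt: {
                "url": "https://api.example.test/v1/chat",
                "body": {"input": prompt},
            },
            response_adapter=lambda raw: json.loads(raw)["output"],
            send=send,
        )


def fake_send(request: dict) -> str:
    # Stand-in for a real HTTP client so the sketch runs offline.
    return json.dumps({"output": request["body"]["input"].upper()})


driver = SketchProviderDriver(fake_send)
print(driver.make_response_for("hello"))  # prints HELLO
```

The subclass contains no request or response handling of its own, which is the property the paragraph above describes.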
The Contracts
Request Side
The CanTranslateInferenceRequest contract defines a single method, which converts an inference request into an HttpRequest.
The request adapter delegates body assembly to a CanMapRequestBody implementation.
The body format in turn delegates message mapping to CanMapMessages, which receives typed Messages and returns a provider-native array. Implementations compose a MessageMapper utility for typed iteration instead of duplicating the loop.
OpenAIRequestAdapter receives a CanMapRequestBody (which itself wraps a CanMapMessages), then builds the final HTTP request with the URL, headers, and formatted body.
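The idea of composing a mapper utility rather than repeating the loop can be sketched as follows. This is not MessageMapper's actual interface: the callback names, the `tool_call_id` field, and the output shape are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    role: str
    content: str = ""
    tool_call_id: Optional[str] = None  # set only on tool-result messages


class SketchMessageMapper:
    """Owns the iteration; a format supplies one callback per message kind."""

    def __init__(self, on_text: Callable[[Message], dict],
                 on_tool_result: Callable[[Message], dict]):
        self.on_text = on_text
        self.on_tool_result = on_tool_result

    def map(self, messages: list) -> list:
        out = []
        for message in messages:
            is_tool_result = message.tool_call_id is not None
            handler = self.on_tool_result if is_tool_result else self.on_text
            out.append(handler(message))
        return out


class SketchProviderMessageFormat:
    """A provider format: no loop of its own, only per-kind formatting."""

    def __init__(self):
        self.mapper = SketchMessageMapper(
            on_text=self._text,
            on_tool_result=self._tool_result,
        )

    def map(self, messages: list) -> list:
        return self.mapper.map(messages)

    def _text(self, m: Message) -> dict:
        return {"role": m.role, "content": m.content}

    def _tool_result(self, m: Message) -> dict:
        return {"role": "tool", "tool_call_id": m.tool_call_id, "content": m.content}
```

A second provider format would reuse `SketchMessageMapper` unchanged and supply different callbacks.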
Response Side
The CanTranslateInferenceResponse contract handles both synchronous and streaming responses.
The toEventBody() method extracts the payload from an SSE line (stripping the data: prefix, detecting [DONE] markers). The fromStreamDeltas() method parses a sequence of those payloads into PartialInferenceDelta objects carrying incremental content, tool call fragments, and usage snapshots.
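The two streaming steps can be illustrated with a short sketch. It makes several assumptions that the text above does not confirm: that a [DONE] marker and non-data lines are both signalled by returning `None`, and that each payload is a JSON object with a `text` field. The function names are Python stand-ins, not the contract's methods.

```python
import json
from typing import Iterable, Optional


def to_event_body(line: str) -> Optional[str]:
    """Return the payload of an SSE data line, or None if there is none.

    None covers blank lines, comments, other SSE fields, and the [DONE] marker.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    return payload


def collect_text(lines: Iterable[str]) -> str:
    """Join the incremental text carried by each delta payload."""
    parts = []
    for line in lines:
        body = to_event_body(line)
        if body is None:
            continue
        delta = json.loads(body)  # hypothetical payload shape: {"text": "..."}
        parts.append(delta.get("text", ""))
    return "".join(parts)


stream = ['data: {"text": "Hel"}', "", 'data: {"text": "lo"}', "data: [DONE]"]
print(collect_text(stream))  # prints Hello
```

Real deltas also carry tool call fragments and usage snapshots, which would be accumulated in the same loop.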
Usage extraction is handled by CanMapUsage, which maps response data to an InferenceUsage object.
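As a rough illustration of usage mapping, the sketch below reads an OpenAI-style `usage` block (`prompt_tokens`, `completion_tokens`) into a small value object. `SketchUsage` and `map_usage` are hypothetical names, and the real InferenceUsage object may carry more fields than shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchUsage:
    input_tokens: int = 0
    output_tokens: int = 0


def map_usage(data: dict) -> SketchUsage:
    """Extract token counts, defaulting to zero when the provider omits them."""
    usage = data.get("usage") or {}
    return SketchUsage(
        input_tokens=int(usage.get("prompt_tokens", 0)),
        output_tokens=int(usage.get("completion_tokens", 0)),
    )
```

A provider that reports usage under different keys would need only its own version of this one function, which is why usage mapping sits behind a separate contract.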
Embeddings Adapters
Embeddings drivers follow the same pattern with their own set of contracts:
| Contract | Purpose |
|---|---|
| `EmbedRequestAdapter` | Converts EmbeddingsRequest to HttpRequest |
| `EmbedResponseAdapter` | Converts HttpResponse to EmbeddingsResponse |
| `CanMapRequestBody` | Assembles the embeddings request body |
| `CanMapUsage` | Extracts usage from embeddings response data |
Adding a New Provider
To add support for a new provider, you typically need to create:
- A message format class if the provider uses a non-OpenAI message structure
- A body format class to assemble requests with any provider-specific fields
- A request adapter to set the URL, headers, and authentication scheme
- A response adapter to parse responses and streaming events
- A usage format class if token usage is reported differently
- A driver class that wires these adapters together and extends BaseInferenceRequestDriver
A provider that follows the OpenAI API format can reuse OpenAICompatibleDriver, which is designed exactly for this purpose: drivers like ollama, together, and moonshot all map to it in the bundled driver registry.
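The mapping of several driver names onto one shared class can be pictured as a simple lookup table. The bundled registry's real shape is not shown in this section, so everything below is an assumption: the dictionary, the `make_driver` function, the constructor argument, and the `custom-provider` entry are all hypothetical.

```python
class SketchOpenAICompatibleDriver:
    """Stand-in for a shared driver reused by OpenAI-compatible providers."""

    def __init__(self, base_url: str):
        self.base_url = base_url


class SketchCustomDriver:
    """Stand-in for a provider that needs its own adapters."""

    def __init__(self, base_url: str):
        self.base_url = base_url


# Several names point at the same class; only the configuration differs.
DRIVER_REGISTRY = {
    "ollama": SketchOpenAICompatibleDriver,
    "together": SketchOpenAICompatibleDriver,
    "moonshot": SketchOpenAICompatibleDriver,
    "custom-provider": SketchCustomDriver,
}


def make_driver(name: str, base_url: str):
    driver_class = DRIVER_REGISTRY.get(name)
    if driver_class is None:
        raise ValueError(f"Unknown driver: {name}")
    return driver_class(base_url)
```

Under this arrangement, supporting another OpenAI-compatible provider is a one-line registry entry rather than a new set of adapter classes.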