The `Inference` and `Embeddings` facades provide a fluent, immutable interface that handles provider differences behind the scenes.
## Inference
The `Inference` class is the main entry point for LLM interactions. It encapsulates provider complexities behind a unified, fluent interface.
Namespace: `Cognesy\Polyglot\Inference\Inference`
### Creating an Instance
Polyglot offers several ways to create an `Inference` instance, depending on how much control you need:
Use `using()` or `fromConfig()` if you have registered custom drivers:
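A minimal sketch of the common construction paths. The preset name `'openai'` is a placeholder for whatever provider preset you have configured; the exact preset names depend on your setup:

```php
<?php
use Cognesy\Polyglot\Inference\Inference;

// Simplest path: rely on the default configuration
$inference = new Inference();

// Select a configured provider preset by name
// ('openai' is a placeholder preset name)
$inference = (new Inference())->using('openai');
```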
### Building a Request
All request methods return a new immutable instance, so you can safely branch configurations:

| Method | Purpose |
|---|---|
| `with(...)` | Set multiple parameters at once (messages, model, tools, toolChoice, responseFormat, options) |
| `withMessages(...)` | Set the conversation messages |
| `withModel(...)` | Override the model |
| `withMaxTokens(...)` | Set maximum output tokens |
| `withTools(...)` | Provide tool/function definitions |
| `withToolChoice(...)` | Control tool selection behavior |
| `withResponseFormat(...)` | Request structured output format |
| `withOptions(...)` | Pass additional provider options (merged with existing) |
| `withStreaming(...)` | Enable or disable streaming |
| `withCachedContext(...)` | Set cached context (messages, tools, toolChoice, responseFormat) |
| `withRetryPolicy(...)` | Configure retry behavior via `InferenceRetryPolicy` |
| `withResponseCachePolicy(...)` | Control response caching via `ResponseCachePolicy` |
| `withRequest(...)` | Set all parameters from an existing `InferenceRequest` |
| `withRuntime(...)` | Swap the underlying runtime |
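Because each builder method returns a new instance, a shared base configuration can be branched safely. A sketch, assuming the `with*()` methods listed above (the preset and model names are placeholders):

```php
<?php
use Cognesy\Polyglot\Inference\Inference;

$base = (new Inference())
    ->using('openai')            // placeholder preset name
    ->withModel('gpt-4o-mini')   // placeholder model name
    ->withMaxTokens(256);

// Each call returns a fresh instance, so branches do not affect each other
$chat    = $base->withMessages([['role' => 'user', 'content' => 'Hello!']]);
$stream  = $chat->withStreaming(true);
// $base is still unmodified here
```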
### Executing and Reading Results
Shortcut methods execute the request and return results directly:

`create()` returns a `PendingInference` without triggering execution:
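A sketch of both paths. The `get()` shortcut and the `response()` accessor on `PendingInference` are assumptions beyond what this section states explicitly:

```php
<?php
use Cognesy\Polyglot\Inference\Inference;

// Shortcut: execute immediately and read the generated text
$text = (new Inference())
    ->using('openai') // placeholder preset name
    ->withMessages('What is the capital of France?')
    ->get();          // assumed shortcut returning the response text

// Deferred: build a PendingInference now, execute later
$pending = (new Inference())
    ->withMessages('Hello')
    ->create();

$response = $pending->response(); // assumed accessor that triggers execution
```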
### Working with Responses
The `InferenceResponse` object provides access to all parts of the provider's response:
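A sketch of reading the typical parts of a response. The accessor names (`content()`, `usage()`, `toolCalls()`) are assumptions about the `InferenceResponse` API:

```php
<?php
use Cognesy\Polyglot\Inference\Inference;

$response = (new Inference())
    ->withMessages('Hi')
    ->response(); // assumed shortcut returning an InferenceResponse

$content = $response->content();   // generated text (assumed accessor)
$usage   = $response->usage();     // token usage, if the provider reports it
$calls   = $response->toolCalls(); // tool calls requested by the model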
### Working with Streams
The `InferenceStream` class provides several ways to consume streaming data:
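One assumed consumption style, iterating partial responses as they arrive. The `stream()` shortcut, `responses()` iterator, and `contentDelta()` accessor are illustrative names, not confirmed API:

```php
<?php
use Cognesy\Polyglot\Inference\Inference;

$stream = (new Inference())
    ->withMessages('Write a haiku about the sea')
    ->withStreaming(true)
    ->stream(); // assumed shortcut returning an InferenceStream

foreach ($stream->responses() as $partial) {
    echo $partial->contentDelta(); // print each incremental chunk
}
```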
## Embeddings
The `Embeddings` class provides a unified interface for generating vector embeddings across providers.
Namespace: `Cognesy\Polyglot\Embeddings\Embeddings`
### Creating an Instance
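A minimal sketch, assuming construction mirrors the `Inference` facade (the preset name is a placeholder):

```php
<?php
use Cognesy\Polyglot\Embeddings\Embeddings;

// Default configuration
$embeddings = new Embeddings();

// Select a configured provider preset by name
// ('openai' is a placeholder preset name)
$embeddings = (new Embeddings())->using('openai');
```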
### Building a Request
| Method | Purpose |
|---|---|
| `with(...)` | Set input, options, and model at once |
| `withInputs(...)` | Set input text(s) to embed (string or array of strings) |
| `withModel(...)` | Override the model |
| `withOptions(...)` | Pass additional provider options |
| `withRetryPolicy(...)` | Configure retry behavior via `EmbeddingsRetryPolicy` |
| `withRequest(...)` | Set all parameters from an existing `EmbeddingsRequest` |
| `withRuntime(...)` | Swap the underlying runtime |
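A sketch combining the builder methods above. The model name and the `dimensions` option are placeholders for whatever your provider supports:

```php
<?php
use Cognesy\Polyglot\Embeddings\Embeddings;

$embeddings = (new Embeddings())
    ->using('openai')                      // placeholder preset name
    ->withModel('text-embedding-3-small')  // placeholder model name
    ->withInputs(['first text', 'second text'])
    ->withOptions(['dimensions' => 256]);  // example provider option
```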
### Executing and Reading Results
`create()` returns a `PendingEmbeddings` instance:
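A sketch of deferred execution. The `get()` accessor on `PendingEmbeddings` is an assumption about the API:

```php
<?php
use Cognesy\Polyglot\Embeddings\Embeddings;

$pending = (new Embeddings())
    ->withInputs('Some text to embed')
    ->create(); // nothing has been sent to the provider yet

$response = $pending->get(); // assumed accessor that triggers execution
```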
### Working with Responses
The `EmbeddingsResponse` object provides access to the embedding vectors:
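A sketch of reading vectors back. The `get()` shortcut and the `vectors()`/`values()` accessors are illustrative names, not confirmed API:

```php
<?php
use Cognesy\Polyglot\Embeddings\Embeddings;

$response = (new Embeddings())
    ->withInputs(['alpha', 'beta'])
    ->get(); // assumed shortcut returning an EmbeddingsResponse

$vectors = $response->vectors();  // one vector per input (assumed accessor)
$first   = $vectors[0]->values(); // raw array of floats (assumed accessor)
```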
## Registering Custom Drivers
### Inference Drivers
Custom inference drivers are registered through the `InferenceDriverRegistry` and passed to the runtime:
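An illustrative sketch of the registration shape. The namespace, the `register()` signature, the factory arguments, and `MyCustomDriver` are all assumptions; consult the registry class for the actual contract:

```php
<?php
use Cognesy\Polyglot\Inference\Drivers\InferenceDriverRegistry; // namespace assumed

// Register a factory under a driver name, then refer to that name
// from your provider configuration (shape is illustrative).
$registry = new InferenceDriverRegistry();
$registry->register(
    'my-driver',
    fn($config, $httpClient) => new MyCustomDriver($config, $httpClient),
);
```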
### Embeddings Drivers
Custom embeddings drivers are registered through the `EmbeddingsDriverRegistry` and passed to the runtime:
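The same illustrative shape as for inference drivers. Again, the namespace, the `register()` signature, and `MyEmbeddingsDriver` are assumptions:

```php
<?php
use Cognesy\Polyglot\Embeddings\Drivers\EmbeddingsDriverRegistry; // namespace assumed

// Register a factory under a driver name (shape is illustrative)
$registry = new EmbeddingsDriverRegistry();
$registry->register(
    'my-embeddings-driver',
    fn($config, $httpClient) => new MyEmbeddingsDriver($config, $httpClient),
);
```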