When you call Inference::using('openai'), Polyglot loads the openai.yaml preset from the
configuration directory and builds a fully wired runtime behind the scenes. Switching providers
is a one-line change.
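For example, the following sketch swaps providers by changing only the preset name. The builder method names (`withMessages()`, `create()`, `toText()`) are assumptions for illustration; only `Inference::using()` appears above.

```php
use Cognesy\Polyglot\Inference\Inference;

// Change 'openai' to 'anthropic', 'groq', etc. - nothing else needs to change.
$answer = Inference::using('openai')
    ->withMessages([['role' => 'user', 'content' => 'What is the capital of France?']])
    ->create()
    ->toText();
```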
Switching Providers
Because presets encapsulate all provider details, the same request code works against any supported backend.
Understanding Presets vs. Driver Types
It is important to distinguish between a preset name and a driver type. A preset name (e.g. openai, ollama, custom-local) is an arbitrary label for a YAML configuration file.
A driver type (e.g. openai, anthropic, openai-compatible) refers to the underlying
protocol implementation that Polyglot uses to communicate with the API.
Multiple presets can share the same driver. For example, you might create a local-llama preset
that uses the openai-compatible driver pointed at a local Ollama instance, and a together
preset that also uses the openai-compatible driver pointed at the Together AI API.
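A local-llama preset along these lines might look like the following sketch. The exact field names and file location are assumptions, not the library's documented schema; the point is that the driver key selects the protocol while the URL and model point it at a local Ollama instance.

```yaml
# local-llama.yaml - illustrative preset; field names are assumptions
driver: openai-compatible
api_url: http://localhost:11434/v1   # Ollama's OpenAI-compatible endpoint
api_key: ollama                      # Ollama ignores the key, but one is usually required
default_model: llama3.1
```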
Polyglot ships with the following driver types:
| Driver | Providers |
|---|---|
| openai | OpenAI |
| openai-responses | OpenAI (Responses API) |
| anthropic | Anthropic |
| gemini | Google Gemini (native API) |
| gemini-oai | Google Gemini (OpenAI-compatible API) |
| azure | Azure OpenAI |
| bedrock-openai | AWS Bedrock (OpenAI-compatible) |
| a21 | AI21 Labs |
| cerebras | Cerebras |
| cohere | Cohere |
| deepseek | DeepSeek |
| fireworks | Fireworks AI |
| glm | GLM |
| groq | Groq |
| huggingface | Hugging Face |
| inception | Inception |
| meta | Meta |
| minimaxi | MiniMax |
| mistral | Mistral AI |
| openrouter | OpenRouter |
| openresponses | Open Responses |
| perplexity | Perplexity |
| qwen | Qwen |
| sambanova | SambaNova |
| xai | xAI (Grok) |
| openai-compatible | Any OpenAI-compatible API (Ollama, Together, Moonshot, etc.) |
Implementing Fallbacks
Polyglot does not impose a fallback policy. Fallback behavior belongs in application code, where you have the context to decide which providers to try and how to handle failures.
Cost-Aware Provider Selection
You can route requests to different presets based on the complexity or importance of each task. This pattern lets you reserve expensive models for critical work while using cheaper alternatives for simpler queries.
Task-Based Provider Selection
Different providers may excel at different tasks. You can map task types to the most appropriate preset, routing creative writing to one model and code generation to another.
Tip: You can combine cost-aware and task-based routing. For example, use a cheap local model for simple factual lookups but route complex creative tasks to a premium cloud provider.
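Combining the routing and fallback patterns above might look like the following sketch. The task labels, preset names, and builder methods (`withMessages()`, `create()`, `toText()`) are illustrative assumptions; the structure is what matters: pick an ordered list of presets per task, then try each in turn.

```php
use Cognesy\Polyglot\Inference\Inference;

// Ordered preset lists per task type - labels and presets are illustrative.
function presetsFor(string $task): array {
    return match ($task) {
        'creative' => ['anthropic', 'openai'],   // premium model first
        'code'     => ['deepseek', 'openai'],
        default    => ['local-llama', 'openai'], // cheap local model first
    };
}

function ask(string $task, string $prompt): string {
    $lastError = null;
    foreach (presetsFor($task) as $preset) {
        try {
            // Builder method names below are assumptions.
            return Inference::using($preset)
                ->withMessages([['role' => 'user', 'content' => $prompt]])
                ->create()
                ->toText();
        } catch (\Throwable $e) {
            $lastError = $e; // fall through to the next preset
        }
    }
    throw new \RuntimeException('All providers failed', previous: $lastError);
}
```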
Reusing an Inference Instance
Each call to Inference::using() loads the preset YAML and builds a new runtime. If you plan
to issue many requests against the same provider, create the instance once and reuse it:
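A minimal sketch of this reuse pattern follows; as before, `withMessages()`, `create()`, and `toText()` are assumed method names.

```php
use Cognesy\Polyglot\Inference\Inference;

// Load the preset and build the runtime once.
$inference = Inference::using('openai');

// Reuse the instance for every request. Since builder methods return
// a new copy, each loop iteration works on its own configuration.
$answers = [];
foreach ($prompts as $prompt) {
    $answers[] = $inference
        ->withMessages([['role' => 'user', 'content' => $prompt]])
        ->create()
        ->toText();
}
```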
Because Inference uses immutable builder methods (each call returns a new copy), sharing a
single instance across concurrent requests is safe.