HttpClientBuilder. In most
cases this is all you need. However, if your application already owns the HTTP transport
concern — for example, you need custom timeouts, middleware, or a shared client instance —
you can build your own HTTP client and inject it into the runtime.
Injecting an HTTP Client
The InferenceRuntime::fromConfig() method accepts an optional httpClient parameter. Build
an HTTP client with HttpClientBuilder, configure it to your needs, and pass it in.
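The injection pattern itself can be sketched without the library. The example below is a minimal illustration, assuming PHP 8.0 or later. Every class in it (`ExampleTransport`, `DefaultTransport`, `TimedTransport`, `ExampleRuntime`) is a hypothetical stand-in and not part of Polyglot. Only the idea of an optional `httpClient` argument on a `fromConfig()` factory comes from the text above.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration; none of these are Polyglot classes.
interface ExampleTransport
{
    public function send(string $url): string;
}

final class DefaultTransport implements ExampleTransport
{
    public function send(string $url): string
    {
        return "default transport -> {$url}";
    }
}

final class TimedTransport implements ExampleTransport
{
    public function __construct(private int $timeoutSeconds) {}

    public function send(string $url): string
    {
        return "custom transport ({$this->timeoutSeconds}s timeout) -> {$url}";
    }
}

final class ExampleRuntime
{
    private function __construct(
        private string $apiUrl,
        private ExampleTransport $transport,
    ) {}

    // Optional dependency: callers who pass nothing get the default transport.
    public static function fromConfig(array $config, ?ExampleTransport $httpClient = null): self
    {
        return new self($config['apiUrl'], $httpClient ?? new DefaultTransport());
    }

    public function ping(): string
    {
        return $this->transport->send($this->apiUrl);
    }
}

$config = ['apiUrl' => 'https://api.example.test'];

// Without an injected client the runtime falls back to its own default.
echo ExampleRuntime::fromConfig($config)->ping(), PHP_EOL;

// With an injected client the application keeps control of the transport.
echo ExampleRuntime::fromConfig($config, httpClient: new TimedTransport(10))->ping(), PHP_EOL;
```

Making the client an optional argument keeps the simple case short while letting an application that already owns its transport share one configured instance.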
HTTP Client Configuration Options
The HttpClientConfig class accepts these parameters:
| Parameter | Default | Description |
|---|---|---|
| driver | 'curl' | The underlying HTTP driver (curl, guzzle, symfony) |
| connectTimeout | 3 | Maximum time to establish a connection (seconds) |
| requestTimeout | 30 | Maximum total request execution time (seconds) |
| idleTimeout | -1 | Idle timeout for streaming connections (seconds, -1 = unlimited) |
| streamChunkSize | 256 | Size of chunks when reading streaming responses (bytes) |
| streamHeaderTimeout | 5 | Timeout for receiving stream headers (seconds) |
| failOnError | false | Whether to throw exceptions on HTTP error status codes |
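To show how these defaults combine with overrides, here is a minimal sketch, assuming PHP 8.1 or later and assuming the parameters are set through named constructor arguments. `ExampleHttpConfig` is a hypothetical stand-in that copies the parameter names and defaults from the table. It is not the library's HttpClientConfig, whose namespace and exact signature are not shown here.

```php
<?php
declare(strict_types=1);

// Stand-in value object for illustration only. It mirrors the parameter names
// and defaults from the table but is NOT the library's HttpClientConfig.
final class ExampleHttpConfig
{
    public function __construct(
        public readonly string $driver = 'curl',
        public readonly int $connectTimeout = 3,       // seconds
        public readonly int $requestTimeout = 30,      // seconds
        public readonly int $idleTimeout = -1,         // seconds, -1 = unlimited
        public readonly int $streamChunkSize = 256,    // bytes
        public readonly int $streamHeaderTimeout = 5,  // seconds
        public readonly bool $failOnError = false,
    ) {}
}

// Override only what differs; every other parameter keeps its default.
$config = new ExampleHttpConfig(
    driver: 'guzzle',
    connectTimeout: 5,
    requestTimeout: 120,
    failOnError: true,
);

printf(
    "driver=%s connect=%ds request=%ds idle=%ds chunk=%dB\n",
    $config->driver,
    $config->connectTimeout,
    $config->requestTimeout,
    $config->idleTimeout,
    $config->streamChunkSize,
);
```

A longer requestTimeout like the one above is a common choice for slow, long-running completions, while the short connectTimeout still fails fast when the provider is unreachable.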
Choosing an HTTP Driver
Polyglot supports multiple HTTP drivers. The default curl driver works without additional
dependencies. If your project already uses Guzzle or Symfony HttpClient, you can reuse them.
Adding Middleware
The HttpClientBuilder supports a middleware stack for cross-cutting concerns like retries,
circuit breaking, and request logging. Middleware is applied in the order it is added.
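The sketch below makes the stacking idea concrete using plain closures, with no dependency on the library. It assumes PHP 8.0 or later, and it assumes that "applied in the order it is added" means the first middleware added is the outermost layer. The function names (`retryMiddleware`, `circuitBreakerMiddleware`, `buildStack`) are hypothetical and are not the HttpClientBuilder API. The circuit breaker is deliberately simplified: it opens after a number of consecutive failures and has no recovery timeout.

```php
<?php
declare(strict_types=1);

// Library-independent sketch. A "handler" is any callable that takes a request
// string and returns a response string, or throws RuntimeException on failure.

/** Retries failed calls, doubling the delay each time (exponential backoff). */
function retryMiddleware(int $maxRetries, int $baseDelayMs): callable
{
    return function (callable $next) use ($maxRetries, $baseDelayMs): callable {
        return function (string $request) use ($next, $maxRetries, $baseDelayMs): string {
            for ($attempt = 0; ; $attempt++) {
                try {
                    return $next($request);
                } catch (RuntimeException $e) {
                    if ($attempt >= $maxRetries) {
                        throw $e; // retries exhausted
                    }
                    // Delays: base, 2 * base, 4 * base, ... (usleep takes microseconds)
                    usleep($baseDelayMs * (2 ** $attempt) * 1000);
                }
            }
        };
    };
}

/** Stops calling $next once too many consecutive failures were seen. */
function circuitBreakerMiddleware(int $failureThreshold): callable
{
    return function (callable $next) use ($failureThreshold): callable {
        $failures = 0; // consecutive failures, shared across calls
        return function (string $request) use ($next, $failureThreshold, &$failures): string {
            if ($failures >= $failureThreshold) {
                throw new RuntimeException('circuit open: request not sent');
            }
            try {
                $response = $next($request);
                $failures = 0; // any success closes the circuit again
                return $response;
            } catch (RuntimeException $e) {
                $failures++;
                throw $e;
            }
        };
    };
}

/** Wraps the transport so the first middleware in the list is the outermost layer. */
function buildStack(array $middleware, callable $transport): callable
{
    $handler = $transport;
    foreach (array_reverse($middleware) as $layer) {
        $handler = $layer($handler);
    }
    return $handler;
}

// A fake transport that fails twice, then succeeds.
$calls = 0;
$flakyTransport = function (string $request) use (&$calls): string {
    $calls++;
    if ($calls < 3) {
        throw new RuntimeException('temporary failure');
    }
    return "response to {$request}";
};

$client = buildStack(
    [
        retryMiddleware(maxRetries: 3, baseDelayMs: 1),
        circuitBreakerMiddleware(failureThreshold: 5),
    ],
    $flakyTransport,
);

echo $client('ping'), PHP_EOL;
```

With retry outermost, every retry attempt passes through the circuit breaker, so a provider that keeps failing eventually trips the breaker and the remaining retries fail fast. Reversing the order would make the breaker count one failure per fully retried request instead.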
Retry Policy
Automatically retry failed requests with exponential backoff.
Circuit Breaker
Protect your application from cascading failures by stopping requests to a failing provider.
Combining Multiple Middleware
Stack retry and circuit breaker policies together for robust error handling.
Custom Middleware
You can also add your own middleware for logging, metrics, or request transformation.
Using with Embeddings
The same pattern works for the embeddings runtime. Pass your custom HTTP client when building an EmbeddingsRuntime.