Streaming responses have their own failure modes, typically related to timeouts, output buffering, and errors raised partway through a stream.

Symptoms

  • Streams cutting off prematurely
  • Errors during stream processing
  • Partial or incomplete responses

Solutions

  1. Connection Timeouts: Increase the HTTP client's timeout settings so that long-running streams are not cut off
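<?php
use Cognesy\Http\HttpClient;
use Cognesy\Http\Data\HttpClientConfig;
// Assumed namespace for Inference; adjust it to match your installed version
use Cognesy\Polyglot\LLM\Inference;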

// Create a custom HTTP client with longer timeouts
$config = new HttpClientConfig(
    requestTimeout: 180,  // 3 minutes for the entire request
    connectTimeout: 10,   // 10 seconds to establish connection
    idleTimeout: 60       // 60 seconds allowed between stream chunks
);

$httpClient = new HttpClient('guzzle', $config);
$inference = new Inference();
$inference->withHttpClient($httpClient);

// Use streaming with the custom client
$response = $inference->create(
    messages: 'Write a long story about a space explorer.',
    options: ['stream' => true]
);

$stream = $response->stream()->responses();
foreach ($stream as $partial) {
    echo $partial->contentDelta;
    flush();
}
  2. Buffer Flushing: Ensure output buffers are properly flushed during streaming
foreach ($stream as $partial) {
    echo $partial->contentDelta;

    // Flush output buffer to ensure content is sent immediately
    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();
}
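The loop above flushes only the innermost output buffer on each chunk. When several buffers are stacked (for example, one opened by a framework on top of the `output_buffering` setting), the outer ones can still hold the content back. One option is to close the stacked buffers once, before streaming starts. The following is a minimal sketch using only PHP built-ins; `closeOutputBuffers()` is a hypothetical helper, not part of the library, and its `$downTo` parameter is an assumption added so that callers can leave outer buffers they do not own untouched.

```php
<?php
// Hypothetical helper (not part of any library): closes stacked output
// buffers until only $downTo levels remain, so that echoed chunks are
// no longer held back by them. Returns the number of buffers closed.
function closeOutputBuffers(int $downTo = 0): int {
    $closed = 0;
    while (ob_get_level() > $downTo) {
        // ob_end_flush() passes the buffer's contents outward and removes
        // the buffer; it returns false when the buffer cannot be removed,
        // in which case we stop instead of looping forever.
        if (!@ob_end_flush()) {
            break;
        }
        $closed++;
    }
    return $closed;
}
```

Calling `closeOutputBuffers()` once before the streaming loop leaves only `echo` and `flush()` to do per chunk. A web server or proxy may still buffer the response on its side; that is outside PHP's control.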
  3. Error Handling in Streams: Catch errors raised during iteration separately from request errors, so that partial output is not lost
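<?php
// Assumed namespace for RequestException; adjust it to match your HTTP client version
use Cognesy\Http\Exceptions\RequestException;

// $inference is assumed to be configured as in the timeout example above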
try {
    $response = $inference->create(
        messages: 'Write a long story.',
        options: ['stream' => true]
    );

    // Initialize before the inner try so the catch block can always read it
    $content = '';

    try {
        $stream = $response->stream()->responses();

        foreach ($stream as $partial) {
            $content .= $partial->contentDelta;
            echo $partial->contentDelta;
            flush();
        }
    } catch (\Exception $streamException) {
        echo "\nStream error: " . $streamException->getMessage() . "\n";

        // If we got a partial response before the error, use it
        // (strict comparison, because empty() also treats "0" as empty)
        if ($content !== '') {
            echo "Partial content received: " . strlen($content) . " characters\n";
        }
    }
} catch (RequestException $e) {
    echo "Request failed: " . $e->getMessage() . "\n";
}
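To see the partial-content pattern in isolation, the sketch below replaces the real stream with a generator that fails midway. `simulatedStream()` and `collectStream()` are hypothetical names used only for illustration. The real stream yields partial response objects with a `contentDelta` property, whereas this stand-in yields plain strings, so it can run without a live API.

```php
<?php
// Stand-in for a real stream: yields two chunks, then fails.
function simulatedStream(): Generator {
    yield 'Once upon ';
    yield 'a time';
    throw new RuntimeException('connection reset');
}

// Consumes a stream, keeping whatever arrived before a failure.
// Returns the collected text and the error message (null if none).
function collectStream(iterable $stream): array {
    $content = '';
    $error = null;
    try {
        foreach ($stream as $chunk) {
            $content .= $chunk;
        }
    } catch (Throwable $e) {
        // The chunks received so far stay in $content
        $error = $e->getMessage();
    }
    return ['content' => $content, 'error' => $error];
}

$result = collectStream(simulatedStream());
// $result['content'] is 'Once upon a time'
// $result['error'] is 'connection reset'
```

Returning the error alongside the content, instead of rethrowing, lets the caller decide whether a partial answer is good enough or whether the request should be retried.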
  4. Fallback to Non-streaming: Implement a fallback to non-streaming mode
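<?php
// Assumed namespace for Inference; adjust it to match your installed version
use Cognesy\Polyglot\LLM\Inference;
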
function getResponse(string $prompt, bool $preferStreaming = true): string {
    $inference = new Inference();

    try {
        if ($preferStreaming) {
            // Try streaming first
            $response = $inference->create(
                messages: $prompt,
                options: ['stream' => true]
            );

            $content = '';
            foreach ($response->stream()->responses() as $partial) {
                $content .= $partial->contentDelta;
                // Output can be done here if needed
            }

            return $content;
        }
    } catch (\Exception $e) {
        // Any partial content collected before the failure is discarded here
        echo "Streaming failed, falling back to non-streaming mode\n";
    }

    // Fallback to non-streaming
    return $inference->create(messages: $prompt)->toText();
}