Benefits of Streaming
Streaming responses offer several advantages:
- Improved User Experience: Display content to users as it’s generated, creating a more responsive interface
- Reduced Latency Perception: Users see the beginning of a response almost immediately
- Progressive Processing: Begin processing early parts of the response while later parts are still being generated
- Handling Long Outputs: Efficiently process responses that may be very long without hitting timeout limits
- Early Termination: Stop generation early if needed, saving resources
Enabling Streaming
Enabling streaming in Polyglot is straightforward: set the stream option to true in your request, then read the response through the stream() method.
Basic Stream Processing
The most common way to process a stream is to iterate through the partial responses, handling each one as it arrives.
Understanding Partial Responses
Each iteration of the stream yields a PartialInferenceResponse object with these key properties:
- contentDelta: The new content received in this chunk
- content: The accumulated content up to this point
- finishReason: The reason why the response finished (empty until the final chunk)
- usage: Token usage statistics
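To make the relationship between these properties concrete, here is a minimal, library-neutral sketch in Python. It does not use Polyglot's API. PartialChunk, fake_stream, and consume are hypothetical names, the "stop" finish reason and the shape of usage are assumptions, and the stream is simulated from a fixed list of text pieces. It only illustrates how contentDelta, content, and finishReason would relate across iterations.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class PartialChunk:
    """Hypothetical stand-in mirroring the four properties listed above."""
    contentDelta: str             # new text carried by this chunk
    content: str                  # all text accumulated so far
    finishReason: str = ""        # empty until the final chunk
    usage: Optional[dict] = None  # assumed shape; filled on the last chunk here


def fake_stream(pieces: Iterable[str]) -> Iterator[PartialChunk]:
    """Simulate a stream: yield one chunk per piece, accumulating content."""
    items = list(pieces)
    accumulated = ""
    for index, piece in enumerate(items):
        accumulated += piece
        is_last = index == len(items) - 1
        yield PartialChunk(
            contentDelta=piece,
            content=accumulated,
            finishReason="stop" if is_last else "",  # "stop" is an assumed value
            usage={"chunks": index + 1} if is_last else None,
        )


def consume(stream: Iterable[PartialChunk]) -> Tuple[str, Optional[PartialChunk]]:
    """Iterate the stream, collecting deltas as a UI would display them."""
    shown = []
    last = None
    for chunk in stream:
        shown.append(chunk.contentDelta)  # a real UI would render this at once
        last = chunk
    return "".join(shown), last


if __name__ == "__main__":
    text, final = consume(fake_stream(["Stream", "ing ", "works"]))
    print(text)
    print(final.finishReason if final else "no chunks received")
```

In this sketch, joining every contentDelta gives the same text as the content of the final chunk, so a consumer can either append deltas as they arrive or re-render from content on each iteration.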