Streaming partial updates during inference
Overview
Instructor can process an LLM's streamed response and provide partial updates, which you can use to update the model with new data while the response is still being generated. This lets you improve the user experience by updating the UI with partial data before the full response is received.
Example
In this example, a UserDetail data model is used to extract arbitrary properties from a text message.
As tokens are streamed from the LLM API, the partialUpdate function is called with a partially updated object of type UserDetail, which you can use, typically to update the UI.
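The mechanism can be illustrated with a small, self-contained sketch. It is written in Python for illustration only and does not call Instructor's actual API: the UserDetail fields, the helper names (close_partial_json, stream_partial_updates), and the simulated chunks are all assumptions made for this example. The idea is to buffer the streamed text, try to close the truncated JSON after each chunk, and invoke the callback whenever a new partial object can be built.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class UserDetail:
    # Hypothetical fields. All are optional because a partial object
    # may not have received a value for them yet.
    name: Optional[str] = None
    age: Optional[int] = None
    hobbies: List[str] = field(default_factory=list)


def close_partial_json(text: str) -> str:
    """Append the closing quote and brackets that truncated JSON is missing."""
    stack = []          # closers for every currently open object or array
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    suffix = '"' if in_string else ""
    return text + suffix + "".join(reversed(stack))


def stream_partial_updates(
    chunks: Iterable[str],
    partial_update: Callable[[UserDetail], None],
) -> UserDetail:
    """Feed streamed chunks into a buffer and report each new partial object."""
    buffer = ""
    latest = UserDetail()
    for chunk in chunks:
        buffer += chunk
        try:
            data = json.loads(close_partial_json(buffer))
        except json.JSONDecodeError:
            continue  # buffer ends mid-key or mid-separator; wait for more
        if not isinstance(data, dict):
            continue
        candidate = UserDetail(
            name=data.get("name"),
            age=data.get("age"),
            hobbies=list(data.get("hobbies") or []),
        )
        if candidate != latest:  # only notify when something changed
            latest = candidate
            partial_update(latest)
    return latest


# Simulated stream; a real LLM API would deliver these chunks over time.
chunks = ['{"name": "Ja', 'son", "age"', ': 28, "hob',
          'bies": ["chess"', ', "sailing"]}']

# In a real application the callback would refresh the UI instead of printing.
final = stream_partial_updates(chunks, lambda user: print("partial:", user))
print("final:", final)
```

Note that intermediate objects can contain incomplete values (for example, a name cut off mid-word), so a callback of this kind should treat partial data as provisional until the final object arrives.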