Overview
OpenAI’s Responses API is its new recommended API for inference, offering improved performance and additional features compared to Chat Completions. Key features:
- 3% better performance on reasoning tasks
- 40-80% improved cache utilization
- Built-in tools: web search, file search, code interpreter
- Server-side conversation state via previous_response_id
- Semantic streaming events
Supported output modes:
- OutputMode::Tools (supported)
- OutputMode::Json (supported)
- OutputMode::JsonSchema (recommended)
- OutputMode::MdJson (fallback)
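To make the server-side conversation state feature more concrete, here is a minimal sketch of how a follow-up request might reference an earlier response through previous_response_id. It only builds the JSON payloads and makes no network call. The helper name build_request, the model name, and the response id are illustrative placeholders, not part of this library or taken from a real API call.

```python
import json
from typing import Optional

# Endpoint of the Responses API; shown for context only, no request is sent.
RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_request(
    model: str,
    user_input: str,
    previous_response_id: Optional[str] = None,
) -> dict:
    """Build a Responses API request body (hypothetical helper)."""
    payload = {"model": model, "input": user_input}
    # When present, the server continues the stored conversation,
    # so earlier messages do not need to be resent by the client.
    if previous_response_id is not None:
        payload["previous_response_id"] = previous_response_id
    return payload


# First turn: no prior state.
first = build_request("gpt-4o-mini", "Name one prime number.")

# Second turn: in real use the id would come from the "id" field of the
# first response; "resp_placeholder" is a made-up value for illustration.
follow_up = build_request(
    "gpt-4o-mini",
    "Now double it.",
    previous_response_id="resp_placeholder",
)

print(json.dumps(follow_up, indent=2))
```

The follow-up payload carries only the new input plus the id of the previous response, which is what distinguishes this style from Chat Completions, where the full message history is resent on every turn.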